Problem

Given a binary string, choose any 2 (maybe same or intersecting) substrings of that binary string such that the OR of 2 substrings are maximum. Return the OR string of 2 substrings.

Returned string shouldn’t contain prefix 0s.

Constraints:

  • 4 <= n <= 1e6
  • Each character of string can be 0 and 1
  • For each character of string, 0 and 1 are equally probablistic and randomly chosen.

Observations

Prefix zero are of no use

0001010

Here we take any 2 substrings

0001010
  01010

prefix zero would always be zero in result string and thus would be trimmed.

STEP 1: So remove prefix zero first

Result string length would be same as given string

After trimming prefix zeros, result string length should be same as given string.

Why?

  • It’s not possible to create bigger string with bitwise OR

  • If we create a smaller string as result, given string str itself is bigger than that (i.e., ans = str | str)

STEP 2: s1 = str

Now, bruteforce starts with considering all possible substrings for s2. But, this gets TLE. So, we need to observe more.

Only need to consider prefix substrings

For any substring [i, j], it’s better to consider [0, j]

Always (s1 OR [0, j]) >= (s1 OR [i, j])

Example:

11010111001

Let’s say we consider 1110 substring. Then considering the prefix substring would give us greater OR.

11010111001

11010111001
       1110 (with substring)
-----------
11010111111

11010111001
  110101110 (with prefix substring)
-----------
11110111111

And this is obvious because of the fact that prefix substring would have more bits to OR so more chances of greater OR.

(WARNING: It’s not same as considering suffix [i, n - 1] as it would change the bit position)

So, now we are able to convert our O(n^2) time for calculating all s2 to O(n) time.

Only need to consider prefixes that make leftmost zero 1

All prefixes that doesn’t make leftmost zero 1 (after OR with s1) would have smaller OR than the ones that do that and hence, doesn’t count in result.

Example:

11010111001

11010111001
  110101110
-----------
11110111111

11010111001
   11010111
-----------
11011111111

Clearly the one, that makes leftmost zero one is greater because it has more binary weight.

Thus if leftmost zero is at kth index, we only need consider k prefixes of length [n - 1, n - k]

So, we reduce the substring finding time from O(n) -> O(k)

But why this fact is important? Because it’s given that each character of string is randomly chosen. So, there is very less probablity that this k would big as n (i.e., 1e6) because there are 50% chances of getting a 0 for each character. So, in string k would surely be low like 25, 26 something.

So, for average case: O(k) ~ O(1)

For worst case: O(k) = O(n)

but because we are given that the string is randomly built, we don’t need to consider worst case and consider average case.

Algorithm

With all this observations, algorithm is simple as:

  • str = trim(given_string)
  • s1 = str
  • for all possible prefix substring s2 that makes leftmost 0 in s1 1, candidate = s1 ^ s2
  • return max of all candidates calculated

Code snippet:

#include <algorithm>
#include <iostream>

template <typename T> T read() {
  T t;
  std::cin >> t;
  return t;
}

// precondition: a.len() <= b.len()
std::string string_or(std::string_view a, std::string b) {
  auto const na = a.size();
  auto const nb = b.size();
  std::transform(std::crbegin(a), std::crend(a), std::crbegin(b),
                 std::rbegin(b), [](auto a, auto b) { return a | b; });
  return b;
}

auto trim(std::string str) {
  auto const first_zero = std::find(std::begin(str), std::end(str), '1');
  str.erase(std::begin(str), first_zero);
  return str;
}

std::string solve() {
  auto const _ = read<int>();
  auto const str = trim(read<std::string>());
  if (str.size() == 0)
    return "0";
  auto const first_zero_idx = std::distance(
      std::begin(str), std::find(std::begin(str), std::end(str), '0'));
  std::string_view s = str;
  auto const n = std::size(str);
  auto max = str;
  for (int i = 1; i <= first_zero_idx; ++i) {
    max = std::max(max, string_or(s.substr(0, n - i), str));
  }
  return max;
}

int main() { std::cout << solve() << std::endl; }

Time Complexity

string_or and std::max takes O(n) time.

Loop [1, first_zero_idx] takes O(1) time on average.

Thus time complexity is O(n) on average. (This is what this question expects due to random strings.)

Worst case time complexity is O(n ^ 2).