最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

python - How to count the first N natural numbers in binary? - Stack Overflow

programmeradmin2浏览0评论

This may seem trivial but I haven't found a good solution to the problem. I have even found this: generate all n bit binary numbers in a fastest way possible. but I haven't found an exact duplicate.

The problem is very simple, given a limit N that is a positive integer, generate the binary representation of every natural number up to N in order (excluding N, the first natural number is 0, so N - 1 is the maximum number to be represented), in tuple form, with every tuple padded with leading zeros so that all representations are of the same length.

For example, if N is 4, the output should be [(0, 0), (0, 1), (1, 0), (1, 1)].

At this point this problem is indeed trivial, but there is a catch, no bin(n) and f'{n:b}' and the like are allowed, the algorithm should entirely operate in the binary domain, because as I understand everything (text, photos, music, videos...) in computers are binary numerals all the way down, so converting representations back and forth is adding unnecessary computations, these computations (base-conversion) are completely redundant and should be eliminated to produce the most efficient program (this is about keeping problems specific to as few domains as possible so that we only operate on those domains).

I wrote a simple program that does exactly what I describe:

from typing import Generator, Tuple

def count_in_binary(n: int) -> Generator[Tuple[int, ...], None, None]:
    if not isinstance(n, int) or n < 1:
        raise ValueError("The argument n must be a positive integer")

    l = (n - 1).bit_length() if n > 1 else 1
    numeral = [0] * l
    maxi = l - 1
    for _ in range(n):
        yield tuple(numeral)
        i = maxi
        while True:
            if not (d := numeral[i]):
                numeral[i] = 1
                break
            else:
                numeral[i] = 0
                i -= 1

But I am not sure if this is the most efficient way to do it in Python. I haven't used many bit operations and computers already represent numbers as binary, so there should be faster ways to do this.

The question is, what is a faster way?


For comparison, here is one way that uses f'{n:b}' and therefore is more concise, but is actually much slower and stupider:

def count_in_binary1(n: int) -> Generator[Tuple[int, ...], None, None]:
    if not isinstance(n, int) or n < 1:
        raise ValueError("The argument n must be a positive integer")

    l = len(f'{n-1:b}')
    for i in range(n):
        yield tuple(map(int, f'{i:0{l}b}'))
In [50]: %timeit list(count_in_binary(256))
59.9 μs ± 209 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [51]: %timeit list(count_in_binary1(256))
452 μs ± 3.68 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Edit

I didn't do much testing with the original function, I just thought it would work, now it's fixed.

And no, the scope of the scope is limited to pure Python, so NumPy isn't allowed.


Edit 2

Now I think there are no exceptions.


I have upvoted and accepted the second answer as it does answer the original question though the original question doesn't include all the relevant information, so the problem I was trying to solve by posting a question remains unsolved.

I have posted a new question without all relevant information, please answer it: Fastest way to find all permutations of 0, 1 of width n without itertools in pure Python?

This may seem trivial but I haven't found a good solution to the problem. I have even found this: generate all n bit binary numbers in a fastest way possible. but I haven't found an exact duplicate.

The problem is very simple, given a limit N that is a positive integer, generate the binary representation of every natural number up to N in order (excluding N, the first natural number is 0, so N - 1 is the maximum number to be represented), in tuple form, with every tuple padded with leading zeros so that all representations are of the same length.

For example, if N is 4, the output should be [(0, 0), (0, 1), (1, 0), (1, 1)].

At this point this problem is indeed trivial, but there is a catch, no bin(n) and f'{n:b}' and the like are allowed, the algorithm should entirely operate in the binary domain, because as I understand everything (text, photos, music, videos...) in computers are binary numerals all the way down, so converting representations back and forth is adding unnecessary computations, these computations (base-conversion) are completely redundant and should be eliminated to produce the most efficient program (this is about keeping problems specific to as few domains as possible so that we only operate on those domains).

I wrote a simple program that does exactly what I describe:

from typing import Generator, Tuple

def count_in_binary(n: int) -> Generator[Tuple[int, ...], None, None]:
    if not isinstance(n, int) or n < 1:
        raise ValueError("The argument n must be a positive integer")

    l = (n - 1).bit_length() if n > 1 else 1
    numeral = [0] * l
    maxi = l - 1
    for _ in range(n):
        yield tuple(numeral)
        i = maxi
        while True:
            if not (d := numeral[i]):
                numeral[i] = 1
                break
            else:
                numeral[i] = 0
                i -= 1

But I am not sure if this is the most efficient way to do it in Python. I haven't used many bit operations and computers already represent numbers as binary, so there should be faster ways to do this.

The question is, what is a faster way?


For comparison, here is one way that uses f'{n:b}' and therefore is more concise, but is actually much slower and stupider:

def count_in_binary1(n: int) -> Generator[Tuple[int, ...], None, None]:
    if not isinstance(n, int) or n < 1:
        raise ValueError("The argument n must be a positive integer")

    l = len(f'{n-1:b}')
    for i in range(n):
        yield tuple(map(int, f'{i:0{l}b}'))
In [50]: %timeit list(count_in_binary(256))
59.9 μs ± 209 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [51]: %timeit list(count_in_binary1(256))
452 μs ± 3.68 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Edit

I didn't do much testing with the original function, I just thought it would work, now it's fixed.

And no, the scope of the scope is limited to pure Python, so NumPy isn't allowed.


Edit 2

Now I think there are no exceptions.


I have upvoted and accepted the second answer as it does answer the original question though the original question doesn't include all the relevant information, so the problem I was trying to solve by posting a question remains unsolved.

I have posted a new question without all relevant information, please answer it: Fastest way to find all permutations of 0, 1 of width n without itertools in pure Python?

Share Improve this question edited Mar 9 at 10:13 no comment 10.8k5 gold badges20 silver badges43 bronze badges asked Mar 8 at 12:03 Ξένη ΓήινοςΞένη Γήινος 3,7301 gold badge18 silver badges60 bronze badges 14
  • How is this getting a downvote? I fail to understand. – Ξένη Γήινος Commented Mar 8 at 12:21
  • 4 The output seems a pretty inefficient representation for this rather trivial problem. I expect creating tuples to be the rather expensive part in an optimised code (integers should be cached). What do you want to do with this? It looks like a XY problem to me (like what is the fastest way to subtract integers). – Jérôme Richard Commented Mar 8 at 13:15
  • 1 Don't suddenly disallow something that wasn't disallowed before. Especially not when that something is good and has already been used in an answer. – no comment Commented Mar 8 at 13:26
  • 1 It seems the requirement for using only built-ins came after you had gotten answers. – Ted Lyngmo Commented Mar 8 at 15:13
  • 1 How could it be obvious to people who don't know what you know? Radically changing the question after you've gotten answers isn't nice. Realizing that you haven't been clear, you could just have accepted an answer that answers the original question and asked a new question where you make the requirements clear. – Ted Lyngmo Commented Mar 8 at 15:19
 |  Show 9 more comments

2 Answers 2

Reset to default 3

I came out with this solution entirely using built-in functions:

def count_in_binary(n) -> iter[tuple]:
    if not isinstance(n, int) or n < 1:
        raise ValueError('The argument n must be a positive integer.')
    
    o = []
    for t in range((n-1).bit_length()-1, -1, -1):
        block_size = 2**t
        period = [0]*block_size + [1]*block_size
        q, r = divmod(n, 2*block_size)
        o.append( block*q + block[:r] )

    return zip(*o)

and these are the results from the workbench designed by no comment):

From Colab (everything looks stable)

Version      n=50         n=256        n=1000       n=5000      n=7500       n=10000     
------------------------------------------------------------------------------------------
original     7.092        36.695       150.360      836.957      1303.160     1764.156    
no_comment   2.455        11.919        45.721      303.794       482.383      682.284     
Sash_Sinha  40.014       225.721      1006.889     6735.926     10929.236    15004.871   
cards        8.226        32.446       131.652      885.364      1308.538     1930.887       

From my PC

Version      n=50         n=256        n=1000       n=50000      n=7500       n=10000     
------------------------------------------------------------------------------------------
original     19.826       95.492       459.792      2849.338     4399.717     6105.478    
no_comment    8.190       28.310        99.069       873.893     1444.020     2043.485    
Sash_Sinha  145.806      810.397      3568.845     24405.219    37215.727    52120.108   
cards        27.163       77.026       232.923      1683.323     2685.508     4003.884

Note that when n=1 my implementation returns simply a [] because it is not clear how should be represented 0 in a tuple form.

With itertools:

def count_in_binary(n: int) -> Iterator[Tuple[int, ...]]:
    if not isinstance(n, int) or n < 1:
        raise ValueError("The argument n must be a positive integer")

    L = (n-1).bit_length() or 1
    return islice(product((0, 1), repeat=L), n)

Benchmark results and code:

 47 μs  count_in_binary__original
 15 μs  count_in_binary__no_comment
345 μs  count_in_binary__Sash_Sinha
from typing import Generator, Tuple, Iterator
from itertools import *


def count_in_binary__original(n: int) -> Generator[Tuple[int, ...], None, None]:
    if not isinstance(n, int) or n < 1:
        raise ValueError("The argument n must be a positive integer")

    l = (n - 1).bit_length() if n > 1 else 1
    numeral = [0] * l
    maxi = l - 1
    for _ in range(n):
        yield tuple(numeral)
        i = maxi
        while True:
            if not (d := numeral[i]):
                numeral[i] = 1
                break
            else:
                numeral[i] = 0
                i -= 1


def count_in_binary__no_comment(n: int) -> Iterator[Tuple[int, ...]]:
    if not isinstance(n, int) or n < 1:
        raise ValueError("The argument n must be a positive integer")

    L = (n-1).bit_length() or 1
    return islice(product((0, 1), repeat=L), n)


def count_in_binary__Sash_Sinha(n: int) -> Generator[Tuple[int, ...], None, None]:
    """Yield binary tuples for 0 to n-1 with width = bit-length of n-1."""
    if not isinstance(n, int) or n < 1:
        raise ValueError('The argument n must be a positive integer.')
    bit_length = (n - 1).bit_length() or 1
    for i in range(n):
        yield tuple((i >> j) & 1 for j in range(bit_length - 1, -1, -1))


funcs = [
    count_in_binary__original,
    count_in_binary__no_comment,
    count_in_binary__Sash_Sinha,
]

for n in range(1, 100):
    expect = list(count_in_binary__original(n))
    for f in funcs:
        result = list(f(n))
        if result != expect:
            print(n, f.__name__, result)

import timeit
for f in funcs:
    t = min(timeit.repeat(lambda: list(f(256)), number=10**3)) * 1e3
    print(f'{t:3.0f} μs ', f.__name__)

Attempt This Online!

发布评论

评论列表(0)

  1. 暂无评论