Brute force decompression from only chunk checksums to achieve high compression ratio - Stack Overflow

I would like to write code for a thought experiment: brute-forcing the decompression of "compressed data" that consists only of chunk checksums. The reasoning is that if brute forcing were feasible, much greater compression ratios would be possible, e.g. 100:1.

I am prepared to be roasted for this. I know the basics of hashing and that hashes are one-way.

I really just wanted to see if it was even plausible regardless of computation constraints.
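A minimal sketch of the counting problem behind this thought experiment, scaled down so the whole space can be enumerated: 2-byte chunks instead of 100-byte chunks, and a toy 8-bit checksum instead of the first byte of SHA-256. The names toy_checksum, count_preimages and max_preimages are hypothetical helpers introduced only for illustration. Because 65536 chunks share 256 checksum values, at least one checksum value must be shared by 256 or more chunks, so a checksum alone cannot say which chunk was the original.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 8-bit checksum: FNV-1a folded down to one byte. This is a stand-in
   (an assumption for illustration), not the SHA-256 used in the question. */
static uint8_t toy_checksum(const uint8_t *data, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return (uint8_t)(h ^ (h >> 8) ^ (h >> 16) ^ (h >> 24));
}

/* Count how many of the 65536 possible 2-byte chunks map to `target`. */
static unsigned count_preimages(uint8_t target) {
    unsigned count = 0;
    for (unsigned v = 0; v < 65536u; v++) {
        uint8_t chunk[2] = { (uint8_t)(v >> 8), (uint8_t)(v & 0xFFu) };
        if (toy_checksum(chunk, sizeof chunk) == target) {
            count++;
        }
    }
    return count;
}

/* Largest number of chunks that share any single checksum value. */
static unsigned max_preimages(void) {
    unsigned best = 0, total = 0;
    for (unsigned t = 0; t < 256u; t++) {
        unsigned c = count_preimages((uint8_t)t);
        total += c;
        if (c > best) {
            best = c;
        }
    }
    assert(total == 65536u); /* every chunk has exactly one checksum */
    return best;
}
```

The same arithmetic applied to the real parameters gives 2^800 possible 100-byte chunks for 2^8 one-byte seeds, so on average 2^792 different chunks would share each seed, regardless of how much compute the brute-force search has.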

I asked Grok for some C code for an algorithm it thought up, and the decompressed data never matches the original input data.

I've looked it over and my best guess is that I'm hitting SHA-256 collisions, but I'm not exactly sure. Maybe my comparison is wrong. C is not my main language.
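One way to test the collision guess, as a sketch under stated assumptions: the code in this question keeps only the first byte of each digest, so two chunks "collide" whenever that single byte agrees, which does not require a collision of the full 256-bit digest. The example below uses a toy 8-bit checksum as a stand-in for that first byte (toy_checksum and find_collision are hypothetical names, not part of the original code) and shows that a collision is guaranteed within any 257 distinct inputs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy 8-bit checksum: FNV-1a folded down to one byte. A stand-in
   (assumption) for "first byte of SHA-256", used to avoid OpenSSL here. */
static uint8_t toy_checksum(const uint8_t *data, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return (uint8_t)(h ^ (h >> 8) ^ (h >> 16) ^ (h >> 24));
}

/* Scan the inputs 0, 1, 2, ..., 256 (each encoded as 2 bytes) until two of
   them share a checksum. Returns 1 and stores the colliding pair. With only
   256 possible checksum values, 257 distinct inputs must contain a repeat. */
static int find_collision(unsigned *first, unsigned *second) {
    int seen[256];
    for (int i = 0; i < 256; i++) {
        seen[i] = -1; /* -1 means "no input with this checksum yet" */
    }
    for (unsigned v = 0; v <= 256u; v++) {
        uint8_t chunk[2] = { (uint8_t)(v >> 8), (uint8_t)(v & 0xFFu) };
        uint8_t c = toy_checksum(chunk, sizeof chunk);
        if (seen[c] >= 0) {
            *first = (unsigned)seen[c];
            *second = v;
            return 1;
        }
        seen[c] = (int)v;
    }
    assert(0 && "unreachable by the pigeonhole principle");
    return 0;
}
```

Swapping toy_checksum for the real first-byte-of-SHA-256 function would leave the argument unchanged, since it depends only on the output being one byte wide.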

Here's Grok explaining the code:

Here's Grok's suggestion for an algorithm, which I then had ChatGPT write the code for (because I ran out of Grok time):

^ It's the answer that starts with "Alright, let’s take a clean slate and think harder to achieve a 100:1 compression ratio"

Here's the code:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <malloc.h>  // glibc-specific; needed for malloc_usable_size()
#include <openssl/evp.h>

#define SHA256_DIGEST_LENGTH 32  // Define missing SHA-256 digest length
#define CHUNK_SIZE 100
#define COMPRESSED_SIZE 1
#define SIZE_MB 1 // Size in MB to allocate
#define ALLOC_CHUNK_SIZE (1024 * 1024)  // Parenthesized so the macros expand safely inside larger expressions
#define ALLOC_SIZE (SIZE_MB * ALLOC_CHUNK_SIZE)
#define TOTAL_CHUNKS (SIZE_MB * ALLOC_CHUNK_SIZE / CHUNK_SIZE)

// Function to compute SHA-256 hash using OpenSSL EVP API
void sha256_hash(const uint8_t *data, size_t len, uint8_t *out_hash) {
    EVP_MD_CTX *ctx = EVP_MD_CTX_new();
    EVP_DigestInit_ex(ctx, EVP_sha256(), NULL);
    EVP_DigestUpdate(ctx, data, len);
    EVP_DigestFinal_ex(ctx, out_hash, NULL);
    EVP_MD_CTX_free(ctx);
}

// Hash function to generate a seed from a chunk and previous seed
uint8_t generate_seed(uint8_t prev_seed, const uint8_t *chunk) {
    uint8_t hash[SHA256_DIGEST_LENGTH];
    uint8_t input[CHUNK_SIZE + 1];

    input[0] = prev_seed;
    memcpy(input + 1, chunk, CHUNK_SIZE);

    sha256_hash(input, sizeof(input), hash);

    //printf("Generated seed: %02x\n", hash[0]); // Debug output
    return hash[0]; // Use first byte as compressed seed
}

// Expand a seed to 100 bytes using hash stretching
void expand_seed(uint8_t seed, uint8_t *output) {
    uint8_t hash[SHA256_DIGEST_LENGTH];

    sha256_hash(&seed, sizeof(seed), hash);
    memcpy(output, hash, SHA256_DIGEST_LENGTH);

    sha256_hash(hash, SHA256_DIGEST_LENGTH, hash);
    memcpy(output + SHA256_DIGEST_LENGTH, hash, SHA256_DIGEST_LENGTH);

    sha256_hash(hash, SHA256_DIGEST_LENGTH, hash);
    memcpy(output + 2 * SHA256_DIGEST_LENGTH, hash, CHUNK_SIZE - 2 * SHA256_DIGEST_LENGTH);

    //printf("Expanded seed %02x to 100 bytes\n", seed); // Debug output
}

// Compression function
void compress(const uint8_t *input, uint8_t *output) {
    uint8_t prev_seed = 0; // Initial seed
    for (size_t i = 0; i < TOTAL_CHUNKS; i++) {
        output[i] = generate_seed(prev_seed, input + i * CHUNK_SIZE);
        prev_seed = output[i];
        //printf("Compressed chunk %zu: Seed %02x\n", i, output[i]); // Debug output
    }
}

// Decompression function
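void decompress(const uint8_t *compressed, uint8_t *output) {
    uint8_t prev_seed = 0;
    for (size_t i = 0; i < TOTAL_CHUNKS; i++) {
        // Use an int counter so all 256 seeds (0..255) are tried. A uint8_t
        // counter with "< 255" skips seed 255, and "<= 255" would never end.
        for (int candidate_seed = 0; candidate_seed <= 255; candidate_seed++) {
            expand_seed((uint8_t)candidate_seed, output + i * CHUNK_SIZE);
            // A match only means the one-byte seeds agree, not that the
            // expanded chunk equals the original chunk.
            if (generate_seed(prev_seed, output + i * CHUNK_SIZE) == compressed[i]) {
                //printf("Decompressed chunk %zu: Found seed %02x\n", i, candidate_seed); // Debug output
                break;
            }
        }
        // compress() chains on the stored seed, so advance the chain here
        // whether or not a candidate matched.
        prev_seed = compressed[i];
    }
}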

int main(void) {
    uint8_t *input_data = malloc(ALLOC_SIZE);
    uint8_t *compressed_data = malloc(TOTAL_CHUNKS);
    uint8_t *decompressed_data = malloc(ALLOC_SIZE);
    if (!input_data || !compressed_data || !decompressed_data) {
        fprintf(stderr, "Allocation failed.\n");
        free(input_data);
        free(compressed_data);
        free(decompressed_data);
        return 1;
    }

    // Fill input data with random values
    for (size_t i = 0; i < ALLOC_SIZE; i++) {
        input_data[i] = rand() % 256;
    }
    // Note: malloc_usable_size() reports the allocator's block size,
    // which may be larger than the size that was requested.
    printf("Actual input data size: %zu bytes\n", malloc_usable_size(input_data));

    printf("Starting compression...\n");
    compress(input_data, compressed_data);
    printf("Compression complete.\n");
    printf("Actual compressed data size: %zu bytes\n", malloc_usable_size(compressed_data));

    printf("Starting decompression...\n");
    decompress(compressed_data, decompressed_data);
    printf("Decompression complete.\n");
    printf("Actual decompressed data size: %zu bytes\n", malloc_usable_size(decompressed_data));

    // Compare only the bytes covered by whole chunks: ALLOC_SIZE is not a
    // multiple of CHUNK_SIZE, so the tail of decompressed_data is never written.
    int ret = 0;
    if (memcmp(input_data, decompressed_data, (size_t)TOTAL_CHUNKS * CHUNK_SIZE) == 0) {
        printf("Decompression successful: Data matches original input.\n");
    } else {
        printf("Decompression failed: Data does not match original input.\n");
        ret = 1;
    }

    free(input_data);
    free(compressed_data);
    free(decompressed_data);

    return ret;
}
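
A possible explanation for the mismatch, sketched under stated assumptions: decompress() can only ever write chunks produced by expand_seed(), and a one-byte seed yields at most 256 distinct chunks, while the input is random. The toy model below shrinks chunks to 2 bytes, uses a toy 8-bit checksum in place of the first byte of SHA-256, and leaves out the prev_seed chaining. All names in it (TOY_CHUNK, toy_checksum, toy_expand, toy_roundtrip, count_roundtrips) are hypothetical. It counts how many of the 65536 possible chunks survive a compress and brute-force-decompress round trip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CHUNK 2 /* hypothetical tiny chunk size so every chunk can be tested */

/* Toy 8-bit checksum: FNV-1a folded down to one byte (a stand-in, assumed). */
static uint8_t toy_checksum(const uint8_t *data, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return (uint8_t)(h ^ (h >> 8) ^ (h >> 16) ^ (h >> 24));
}

/* Expand a 1-byte seed into a TOY_CHUNK-byte chunk by repeated hashing,
   loosely mirroring the hash stretching in expand_seed(). */
static void toy_expand(uint8_t seed, uint8_t *out) {
    out[0] = toy_checksum(&seed, 1);
    out[1] = toy_checksum(out, 1);
}

/* Compress one chunk to its checksum, then "decompress" by trying all 256
   seeds and accepting the first expansion whose checksum matches.
   Returns 1 only if the accepted expansion equals the original chunk. */
static int toy_roundtrip(const uint8_t *chunk) {
    uint8_t target = toy_checksum(chunk, TOY_CHUNK);
    uint8_t guess[TOY_CHUNK];
    for (int seed = 0; seed <= 255; seed++) {
        toy_expand((uint8_t)seed, guess);
        if (toy_checksum(guess, TOY_CHUNK) == target) {
            return guess[0] == chunk[0] && guess[1] == chunk[1];
        }
    }
    return 0; /* no seed matched at all */
}

/* Number of chunks, out of all 65536, that round-trip correctly. */
static unsigned count_roundtrips(void) {
    unsigned ok = 0;
    for (unsigned v = 0; v < 65536u; v++) {
        uint8_t chunk[TOY_CHUNK] = { (uint8_t)(v >> 8), (uint8_t)(v & 0xFFu) };
        ok += (unsigned)toy_roundtrip(chunk);
    }
    assert(ok <= 256u); /* at most one recoverable chunk per seed */
    return ok;
}
```

In this model at most 256 of 65536 chunks can round-trip. With the question's 100-byte chunks the reachable set is still at most 256 chunks, out of 2^800 possible ones, so a random chunk is essentially never recoverable even when the comparison code is correct.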