javascript - Multi-threading for zip in nodejs - Stack Overflow

Can zip and unzip operations be made multithreaded in Node.js?

There are a bunch of modules like yauzl, but none of them uses multiple threads, and you can't start multiple threads yourself with node-cluster or something similar, because each zip file must be handled in a single thread.

Asked Nov 15, 2019 at 7:33 by Alex
  • You basically need a library with a native module that has access to threads. Node's architecture allows such modules to use threads. – Sn0bli Commented Nov 15, 2019 at 7:53
  • In Node v10.5.0 you can use the --experimental-worker flag for "multithreading" through worker threads; in Node v11.7.0 workers are exposed by default and the flag has been removed (nodejs/en/blog/release/v11.7.0, nodejs/en/blog/release/v10.5.0). You can check the examples at medium./@Trott/using-worker-threads-in-node-js-80494136dbb6 – redhatvicky Commented Nov 27, 2019 at 13:19

5 Answers

According to the zlib documentation:

Threadpool Usage: All zlib APIs, except those that are explicitly synchronous, use libuv's threadpool. This can lead to surprising effects in some applications, such as subpar performance (which can be mitigated by adjusting the pool size) and/or unrecoverable and catastrophic memory fragmentation. https://nodejs.org/api/zlib.html#zlib_threadpool_usage

According to libuv's threadpool documentation, you can set the environment variable UV_THREADPOOL_SIZE to change the maximum size of the pool.
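As a minimal sketch of the point above: the asynchronous zlib calls are dispatched to libuv's threadpool, so several compressions started together can run concurrently, up to the pool size. The function name `gzipAll`, the sample buffers, and the pool size of 8 are assumptions made for illustration. Note that zlib produces gzip/deflate streams, not ZIP archives, so this shows the threading behaviour rather than a full zip implementation.

```javascript
// Set before any threadpool work is scheduled. Launching with
// `UV_THREADPOOL_SIZE=8 node app.js` is the more reliable way to set it.
process.env.UV_THREADPOOL_SIZE = process.env.UV_THREADPOOL_SIZE || '8';

const zlib = require('zlib');
const { promisify } = require('util');

const gzip = promisify(zlib.gzip);
const gunzip = promisify(zlib.gunzip);

// Compress every buffer concurrently; each async zlib call is queued on
// libuv's threadpool instead of running on the main thread.
function gzipAll(buffers) {
  return Promise.all(buffers.map((buf) => gzip(buf)));
}

// Hypothetical sample data: eight 1 MiB buffers.
const inputs = Array.from({ length: 8 }, (_, i) => Buffer.alloc(1024 * 1024, i));

gzipAll(inputs).then((compressed) => {
  console.log(compressed.map((c) => c.length));
});
```

Whether a larger pool actually helps depends on the number of CPU cores and on what else in the process uses the pool (file system calls, DNS lookups, crypto), so it is worth measuring.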

If you instead wish to compress many small files at the same time, you can use Worker Threads: https://nodejs.org/api/worker_threads.html

On reading your question again, it seems you want to handle multiple files. Use Worker Threads: they will not block your main thread, and you can get their output back via promises.
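A rough single-file sketch of that idea, assuming the data to compress is already in memory: each input gets its own worker, and a promise resolves with the compressed result. `gzipInWorker` and the sample inputs are hypothetical names, and the worker source is inlined with `eval: true` only to keep the example in one file. As with the zlib module in general, this produces gzip output rather than a ZIP archive.

```javascript
const { Worker } = require('worker_threads');

// Source of the worker, inlined so the sketch fits in one file.
// It gzips the bytes it receives and posts the result back.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const zlib = require('zlib');
  parentPort.postMessage(zlib.gzipSync(Buffer.from(workerData)));
`;

// Run one compression in its own thread and expose it as a promise.
function gzipInWorker(data) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: data });
    // The result arrives as a Uint8Array; wrap it in a Buffer for convenience.
    worker.once('message', (result) => resolve(Buffer.from(result)));
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}

// Hypothetical inputs: three in-memory "files".
const files = ['alpha', 'beta', 'gamma'].map((s) => Buffer.from(s.repeat(1000)));

Promise.all(files.map((file) => gzipInWorker(file))).then((zipped) => {
  console.log(zipped.map((z) => z.length));
});
```

Starting a worker has a cost, so for many small files a fixed pool of long-lived workers would probably be a better fit than one worker per file.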

Node.js uses libuv and worker threads. Worker threads are a way to run operations in a multi-threaded manner, while libuv maintains a pool of threads whose size you can increase beyond the Node.js default. You can use both to improve performance for your operation.

Here is the official documentation for worker threads: https://nodejs.org/api/worker_threads.html

To see how to increase the thread pool size in Node.js, see: print libuv threadpool size in node js 8

Can zip and unzip operation be made-multithreaded in nodejs?

Yes.

...and you can't start multiple threads yourself ... because each zip file must be handled in a single thread

I suspect your premise is faulty. Why exactly do you think a node process cannot start multiple threads? Here is an app I'm running which is using the very mature node.js cluster module with a parent process acting as a supervisor and two child processes doing heavily network and disk I/O bound tasks.

As you can see in the C column, each process is running on a separate thread. This lets the master process remain responsive for command and control tasks (like spawning/reaping workers) while the worker processes are CPU or disk bound. This particular server accepts files from the network, sometimes decompresses them, and feeds them through external file processors. IOW, it's a task that includes compression like you describe.

I'm not sure you'd want to use worker threads based on this snippet from the docs:

Workers (threads) are useful for performing CPU-intensive JavaScript operations. They will not help much with I/O-intensive work. Node.js’s built-in asynchronous I/O operations are more efficient than Workers can be.

To me, that description screams, "crypto!" In the past I've spawned child processes when having to perform any expensive crypto operations.

In another project I use node's child_process module and kick off a new child process each time I have a batch of files to compress. That particular service sees a list of ~400 files with names like process-me-2019.11.DD.MM and concatenates them into a single process-me-2019-11-DD file. It takes a while to compress, so spawning a new process avoids blocking on the main thread.
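The child-process approach described above could look roughly like the following. This is a simplified sketch, not the service itself: it compresses a single in-memory buffer rather than concatenating a batch of files, and `gzipInChildProcess` and the inlined child script are hypothetical names chosen for the example.

```javascript
const { spawn } = require('child_process');

// Program run by the child: gzip everything from stdin to stdout.
const childScript = `
  const zlib = require('zlib');
  process.stdin.pipe(zlib.createGzip()).pipe(process.stdout);
`;

// Compress `input` in a separate Node.js process so the parent's
// main thread stays free while the compression runs.
function gzipInChildProcess(input) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', childScript], {
      stdio: ['pipe', 'pipe', 'inherit'],
    });
    const chunks = [];
    child.stdout.on('data', (chunk) => chunks.push(chunk));
    child.on('error', reject);
    child.on('close', (code) => {
      if (code === 0) resolve(Buffer.concat(chunks));
      else reject(new Error(`Child exited with code ${code}`));
    });
    // Send the data and close stdin so the child knows the input is complete.
    child.stdin.end(input);
  });
}

gzipInChildProcess(Buffer.from('some batch of data '.repeat(1000))).then((zipped) => {
  console.log(`compressed to ${zipped.length} bytes`);
});
```

A process is heavier to start than a thread, but it is fully isolated: a crash or memory spike in the child does not take the parent down with it.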

Here is how to do multi-threading in Node.js. You will need to create the three files below.

index.mjs

import run from './Worker.mjs';

/**
* Build your input list of zip files here and pass them to `run` one file name at a time,
* using a loop or similar. `run` returns a promise.
* Example: run( <your_input> ).then( <your_output> );
**/

Worker.mjs

import { Worker } from 'worker_threads';

function runService(id, options) {
    return new Promise((resolve, reject) => {
        // Pass the inputs to the worker; the values must be structured-cloneable.
        const worker = new Worker('./src/WorkerService.mjs', { workerData: { id, options } });
        worker.on('message', res => resolve({ res: res, threadId: worker.threadId }));
        worker.on('error', reject);
        worker.on('exit', code => {
            if (code !== 0)
                reject(new Error(`Worker stopped with exit code ${code}`));
        });
    });
}

async function run(id, options) {
    return await runService(id, options);
}

export default run;

WorkerService.mjs

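import { workerData, parentPort } from 'worker_threads';

// Your logic for zipping a file goes here; `workerData` holds the input passed from Worker.mjs.
// Post the result back so that the promise in Worker.mjs resolves:
// parentPort.postMessage(<your_output>);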

Let me know if it helps.

There is no way to do multi-threading in pure Node.js unless you use a third-party library. You can run the work in parallel using promises. If you don't want to overload Node's main thread, you can offload the work to a queue such as RabbitMQ (or a Redis-backed queue). It runs separately from your main thread, so your main thread will never be blocked.
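For the "parallel using promises" part, a minimal sketch might look like this, assuming Node.js 15 or later for `stream/promises`. The names `gzipFile` and `gzipFiles` and the temporary demo files are made up for the example. Each file is streamed through gzip, and the async zlib work is handled off the main thread by Node's built-in threadpool. The output is one `.gz` file per input, not a ZIP archive.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');
const { pipeline } = require('stream/promises');

// Gzip one file to `<file>.gz` using streams and return the target path.
async function gzipFile(file) {
  const target = `${file}.gz`;
  await pipeline(
    fs.createReadStream(file),
    zlib.createGzip(),
    fs.createWriteStream(target)
  );
  return target;
}

// Start all compressions at once and wait for every one to finish.
function gzipFiles(files) {
  return Promise.all(files.map((file) => gzipFile(file)));
}

// Demo with hypothetical temporary files.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'gzip-demo-'));
const demoFiles = ['one.txt', 'two.txt'].map((name) => path.join(demoDir, name));
demoFiles.forEach((file) => fs.writeFileSync(file, 'sample text '.repeat(1000)));

gzipFiles(demoFiles).then((targets) => console.log(targets));
```

With a very long file list it would be sensible to limit how many compressions run at once, since every open stream holds file descriptors and buffers.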
