bigdata - How to Build a Type-Safe Data Processing Pipeline in TypeScript? - Stack Overflow

Big data applications often involve complex data transformations: filtering, mapping, aggregating, and so on. Ensuring type safety in these dynamic pipelines can be tricky.

How can I create a flexible, type-safe function that processes data through a sequence of transformation steps, ensuring that each step receives correctly typed input from the previous one?

Here's an example of a transformation pipeline where each function modifies the data step by step:

type Transformation<T, U> = (input: T) => U;

// TODO: surely this can be improved
function createPipeline<T, Steps extends [...Transformation<any, any>[]]>(
  initialData: T,
  ...steps: Steps
): ReturnType<Steps[number]> {
  return steps.reduce((data, step) => step(data), initialData);
}

// Example usage:
const parseNumbers = (data: string[]) => data.map(Number);
const filterValidNumbers = (data: number[]) => data.filter(n => !isNaN(n));
const sumNumbers = (data: number[]) => data.reduce((sum, n) => sum + n, 0);

const result = createPipeline(['1', '2', 'three'], parseNumbers, filterValidNumbers, sumNumbers);
// Expected output: 3

TypeScript doesn't enforce that the output type of one step matches the input type of the next, which can lead to runtime errors.
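To make the problem concrete, here is a minimal sketch of how mis-ordered steps slip past the compiler. `loosePipeline` is a hypothetical, simplified stand-in for the function above (it is not part of the original code), and the steps are deliberately swapped:

```typescript
type Transformation<T, U> = (input: T) => U;

// Hypothetical loosely typed variant: every step is Transformation<any, any>,
// so the compiler has nothing to compare between adjacent steps.
function loosePipeline(
  initialData: unknown,
  ...steps: Transformation<any, any>[]
): unknown {
  return steps.reduce<unknown>((data, step) => step(data), initialData);
}

const parseNumbers = (data: string[]) => data.map(Number);
const sumNumbers = (data: number[]) => data.reduce((sum, n) => sum + n, 0);

// The two steps are in the wrong order, yet this call type-checks.
// At runtime sumNumbers concatenates the strings into one string, and
// parseNumbers then calls .map on that string, which throws a TypeError.
let caught: unknown = undefined;
try {
  loosePipeline(['1', '2'], sumNumbers, parseNumbers);
} catch (error) {
  caught = error;
}

console.log(caught instanceof TypeError); // true
```

The failure only shows up when the code runs, which is exactly what a statically checked pipeline should rule out.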

With newer features like variadic tuple types and the satisfies operator, is there a way to make TypeScript statically verify that each transformation is applied in the correct sequence?

asked Mar 5 at 21:17 by Ian Carter
1 Answer

You can use a recursive conditional type to check that the initial input type, or the previous step's return type, matches the next function's input type:

type Transformation<T, U> = (input: T) => U;

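// Walks the tuple P from left to right. T is the input type the current step
// must accept; the step's return type R becomes T for the rest of the tuple.
// Resolves to the original tuple O when every step lines up, otherwise to never.
type Steps<T, P extends Transformation<any, any>[], O = P> = 
    P extends [] ? O : P extends [(input: T) => infer R, ...infer B extends Transformation<any, any>[]] ? Steps<R, B, O> : never;

// Extracts the last element of a tuple; used to type the pipeline's result.
type Last<T extends any[]> = T extends [infer A, ...infer B] ? B extends [] ? A : Last<B> : never;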

function createPipeline<T, S extends [...Transformation<any, any>[]]>(
  initialData: T,
  ...steps: Steps<T, S>
) {
  return steps.reduce((data, step) => step(data), initialData) as ReturnType<Last<S>>;
}


const parseNumbers = (data: string[]) => data.map(Number);
const filterValidNumbers = (data: number[]) => data.filter(n => !isNaN(n));
const sumNumbers = (data: number[]) => data.reduce((sum, n) => sum + n, 0);

const result = createPipeline(['1', '2', 'three'], parseNumbers, filterValidNumbers, sumNumbers); // inferred as number
const result2 = createPipeline(['1', '2', 'three'], filterValidNumbers, parseNumbers, sumNumbers); // compile-time error: filterValidNumbers expects number[], not string[]
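
To see why the second call is rejected, the helper types can be evaluated on their own, outside of `createPipeline`. The sketch below copies `Steps` and `Last` from the code above; `IsNever`, the `Valid` and `Invalid` tuples, and the constant names are hypothetical additions for illustration only. Each constant compiles only if its annotated type evaluates to the literal assigned to it, so the file doubles as a small type-level test:

```typescript
type Transformation<T, U> = (input: T) => U;

// Helper types copied from the answer above.
type Steps<T, P extends Transformation<any, any>[], O = P> =
  P extends [] ? O
  : P extends [(input: T) => infer R, ...infer B extends Transformation<any, any>[]] ? Steps<R, B, O>
  : never;

type Last<T extends any[]> = T extends [infer A, ...infer B] ? B extends [] ? A : Last<B> : never;

// Signatures of the three example steps.
type ParseNumbers = (data: string[]) => number[];
type FilterValidNumbers = (data: number[]) => number[];
type SumNumbers = (data: number[]) => number;

type Valid = [ParseNumbers, FilterValidNumbers, SumNumbers];
type Invalid = [FilterValidNumbers, ParseNumbers, SumNumbers];

// Hypothetical helper: true only when X is exactly never.
type IsNever<X> = [X] extends [never] ? true : false;
type LastReturnsNumber = ReturnType<Last<Valid>> extends number ? true : false;

// Valid chain: string[] -> number[] -> number[] -> number, so Steps yields the tuple itself.
const validChainIsNever: IsNever<Steps<string[], Valid>> = false;
// Invalid chain: the first step expects number[] but the data is string[], so Steps yields never.
const invalidChainIsNever: IsNever<Steps<string[], Invalid>> = true;
// Last picks SumNumbers, whose return type is number.
const lastReturnsNumber: LastReturnsNumber = true;

const results: Record<string, boolean> = {
  validChainIsNever,
  invalidChainIsNever,
  lastReturnsNumber,
};
console.log(results);
```

Because the rest parameter is typed as `Steps<T, S>`, a chain that resolves to `never` accepts no arguments at all, which is what produces the error on the mis-ordered call. Note that the check is based on assignability, so a step whose parameter type is wider than the previous output (for example `unknown`) would still be accepted.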