最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

How to manage memory for chained array operations in C? - Stack Overflow

programmeradmin2浏览0评论

I'm just starting to program in c, so i don't have much experience with manual memory management. Coming from python, syntax like the following is very pleasant:

import numpy as np

b = np.array([1,2,3,4,5])
a = np.zeros(b.size)

c = np.sum(np.dot(a,b), np.add(a,b))

So I'm trying to recreate the ease of use to make a library akin to numpy, in c. I know this is in no way practical, but its in part to learn c, and for a numerical methods class I'm currently taking. So far, this is what I've come up with:

/*
array type with size and data
*/
typedef struct array
{
    size_t size;
    float data[];
} array;

/*
array constructor
*/
array *array_init(size_t size, float *to_arr)
{
    array *arr = malloc(
        sizeof(array) + sizeof(float) * size
    );

    if (arr==NULL){
        printf("Couldn't allocate memory");
    }

    if (to_arr!=NULL)
    {
        for (size_t i = 0; i < size; i++)
        {
            arr->data[i]=to_arr[i];
        }   
    }
    
    arr->size=size;
    return arr;
}

/*
array constructor
filled with a float n
*/
array *array_like(size_t size, float n)
{
    array *arr = malloc(
        sizeof(array) + sizeof(float) * size
    );

    if (arr==NULL){
        printf("Couldn't allocate memory");
    }

    for (size_t i = 0; i < size; i++)
    {
        arr->data[i]=n;
    }   
    
    arr->size=size;
    return arr;
}


/*
array destructor
*/
void array_del(array *arr)
{
    free(arr);
}

alongside several functions of a similar style

/*
element-wise array multiplication
*/
array *a_mul(array *a, array *b)
{
    array *tmp = array_init(a->size, NULL);

    for (int i = 0; i < a->size; i++)
    {
        tmp->data[i] = a->data[i] * b->data[i];
    }

    return tmp;
    
}

But I've come to realize that when doing something like

array *a = array_init({3,(float[]){1.0,2.0,3.0}});
array *b = array_init({3,(float[]){1.0,2.0,3.0}});

array *result = a_mul(a_mul(amul(a,b),b),a);

Memory is leaked, as the pointers I've allocated are lost after the function goes out of scope. So the question is, how do i go about it? Is there any best practice going forward? Or do I have to accept that c code is just going to be way more tedious to write?

I've tried looking at libraries like Eigen or even something simpler such as linmath, but it's all header files, and i can't seem to find anywhere where something like malloc() is used.

So far, I've thought of freeing each of the inputs of the function after the operation is done, like

/*
element-wise array multiplication
*/
array *a_mul(array *a, array *b)
{
    array *tmp = array_init(a->size, NULL);

    for (int i = 0; i < a->size; i++)
    {
        tmp->data[i] = a->data[i] * b->data[i];
    }
    
    free(a);
    free(b);
    return tmp;
    
}

I've also looked into the concept of an arena, but i do not know if it would be a better choice. I've also checked another post where a dynamic stack is suggested.

So if anyone has had a similar problem, or knows of a memory management pattern that would be better suitable, it would help a lot.

I'm just starting to program in c, so i don't have much experience with manual memory management. Coming from python, syntax like the following is very pleasant:

import numpy as np

b = np.array([1,2,3,4,5])
a = np.zeros(b.size)

c = np.sum(np.dot(a,b), np.add(a,b))

So I'm trying to recreate the ease of use to make a library akin to numpy, in c. I know this is in no way practical, but its in part to learn c, and for a numerical methods class I'm currently taking. So far, this is what I've come up with:

/*
array type with size and data
*/
typedef struct array
{
    size_t size;
    float data[];
} array;

/*
array constructor
*/
array *array_init(size_t size, float *to_arr)
{
    array *arr = malloc(
        sizeof(array) + sizeof(float) * size
    );

    if (arr==NULL){
        printf("Couldn't allocate memory");
    }

    if (to_arr!=NULL)
    {
        for (size_t i = 0; i < size; i++)
        {
            arr->data[i]=to_arr[i];
        }   
    }
    
    arr->size=size;
    return arr;
}

/*
array constructor
filled with a float n
*/
array *array_like(size_t size, float n)
{
    array *arr = malloc(
        sizeof(array) + sizeof(float) * size
    );

    if (arr==NULL){
        printf("Couldn't allocate memory");
    }

    for (size_t i = 0; i < size; i++)
    {
        arr->data[i]=n;
    }   
    
    arr->size=size;
    return arr;
}


/*
array destructor
*/
void array_del(array *arr)
{
    free(arr);
}

alongside several functions of a similar style

/*
element-wise array multiplication
*/
array *a_mul(array *a, array *b)
{
    array *tmp = array_init(a->size, NULL);

    for (int i = 0; i < a->size; i++)
    {
        tmp->data[i] = a->data[i] * b->data[i];
    }

    return tmp;
    
}

But I've come to realize that when doing something like

array *a = array_init({3,(float[]){1.0,2.0,3.0}});
array *b = array_init({3,(float[]){1.0,2.0,3.0}});

array *result = a_mul(a_mul(amul(a,b),b),a);

Memory is leaked, as the pointers I've allocated are lost after the function goes out of scope. So the question is, how do i go about it? Is there any best practice going forward? Or do I have to accept that c code is just going to be way more tedious to write?

I've tried looking at libraries like Eigen or even something simpler such as linmath, but it's all header files, and i can't seem to find anywhere where something like malloc() is used.

So far, I've thought of freeing each of the inputs of the function after the operation is done, like

/*
element-wise array multiplication
*/
array *a_mul(array *a, array *b)
{
    array *tmp = array_init(a->size, NULL);

    for (int i = 0; i < a->size; i++)
    {
        tmp->data[i] = a->data[i] * b->data[i];
    }
    
    free(a);
    free(b);
    return tmp;
    
}

I've also looked into the concept of an arena, but i do not know if it would be a better choice. I've also checked another post where a dynamic stack is suggested.

So if anyone has had a similar problem, or knows of a memory management pattern that would be better suitable, it would help a lot.

Share Improve this question edited Feb 7 at 17:10 Alberto Méndez -Dettmer Muñoz asked Feb 7 at 3:24 Alberto Méndez -Dettmer MuñozAlberto Méndez -Dettmer Muñoz 413 bronze badges New contributor Alberto Méndez -Dettmer Muñoz is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct. 5
  • 1 'Or do I have to accept that c code is just going to be way more tedious to write'. Well, C doesn't have a garbage collector, so yes, it will be more tedious to write. But that doesn't mean you will need to do complicated thing. In this situation, what you want is your fonction "a_mul" knowing if he need to clean up the argument he receive. You can do a new function "a_mul_free" that will alway free the arg he receive, or add a new parameter. – Tom's Commented Feb 7 at 8:40
  • Be careful : your function doesn't take into account if a and b have different size. Currently, if b is smaller than a, you will do out-of-bound access of b. – Tom's Commented Feb 7 at 8:43
  • Be careful (bis) : Your 'array_init' function doesn't return anything when malloc fail. It will lead to an access ("arr->size=size") while "arr" is NULL, so it will sigsev. If you want to chain your function call, you will have to take into account that maybe your argument are NULL. – Tom's Commented Feb 7 at 8:52
  • Also, understand that (float[]){1.0,2.0,3.0} is a Compound Literal. There is no memory being allocated. It is a temporary object used to initialize to the data member of the struct. The problem is there is no storage allocated for the struct or the data member when initialization takes place. The problem with a_mul() is that no memory was ever allocated for either a or b so calling free() on each will cause bad things to happen. – David C. Rankin Commented Feb 7 at 9:01
  • @DavidC.Rankin sorry, I realized the array_init function was not used in my example, corrected it in the post – Alberto Méndez -Dettmer Muñoz Commented Feb 7 at 18:15
Add a comment  | 

2 Answers 2

Reset to default 3

I like this problem - as a general rule I like to allocate and deallocate memory in the same scope, othervise it becomes hard to determine which piece of code has the responsibility to deallocate.

Looking at your float array I see that you are trying to allocate an entire structure using one malloc, This will not work

/*
array type with size and data
*/
typedef struct array
{
    size_t size;
    float *data; // Change the type to a generic pointer to a float
} array;

/*
array constructor
*/
array array_init(size_t size, float *to_arr)
{

    // Allocate the object on the stack
    array arr = {.size = size}; 

     // Allocate the float array on the heap
    arr.data = malloc(sizeof(arr.data[0]) * arr.size); 

    // this could be done with one call to memcpy
    for (size_t i = 0; i < size; i++)
    {
        arr.data[i]=to_arr[i];
    }   
    
    return arr; 
}

/*
array constructor
filled with a float n
*/
array array_like(size_t size, float n)
{
    // Allocate the object on the stack
    array arr = {.size = size}; 

     // Allocate the float array on the heap
    arr.data = malloc(sizeof(arr.data[0]) * arr.size); 

    // This could be done using memset
    for (size_t i = 0; i < size; i++)
    {
        arr.data[i]=n;
    }   
    
    return arr;
}

These calls now return a structure, which will be returned by value, and the structure in return contains a float array which is passed arround by reference.

When I look at:

/*
element-wise array multiplication
*/
array a_mul(array *a, array *b)
{
    array tmp = array_init(a->size, 0.0);

    for (int i = 0; i < a->size; i++)
    {
        tmp->data[i] = a->data[i] * b->data[i];
    }
    
    return tmp;
    
}

I would change the return value to once again be parsed arround by value, and the call site is responsible for cleaning up the array.

I hope this made some sense:

  • Allocate the structs and the containing array seperatly(when the arrays have arbitrary sizes)
  • Let the calling code 'Own' the lifetime of the allocations, do not call "free" on its arguments.

You are thinking along the correct lines, but you must preserve a pointer to every block of memory allocated in order to free() that memory when done with it. You have identified the problem, you just haven't figured out how to fix it.

In addition to the problem with preserving pointers, you also have the following issues:

  • You cannot use a brace-initializer containing a Compound-Literal to initialize a pointer. array *a is a pointer, not a struct. You are attempting to initialize the pointer address (the value held by a pointer) to 3 and the compound-literal (float[]){1.0,2.0,3.0} is just a meaningless excess-initializer as far as the compiler is concerned.
  • Your float data[] member is a FAM (flexible array member). See C11 Standard - 6.7.2.1 Structure and union specifiers(p18) , but also see limitations on its use in C11 Standard - 6.7.2.1 Structure and union specifiers(p3).
  • You can allocate for the struct + FAM in a single allocation, but you must preserve the resulting pointer.
  • You cannot assign NULL to the data[] member individual elements in a_mul() by calling array *tmp = array_init(a->size, NULL);, just use 0. instead of NULL with array_like() to zero initialize the memory for data[], e.g. array *tmp = array_like (a->size, 0.);,
  • use size_t in your for() loop instead of int to prevent comparison between signed and unsigned values.

Correcting those issues, you can do what you are attempting similar to the following:

#include <stdio.h>
#include <stdlib.h>

/*
 * array type with size and data
 */
typedef struct array
{
    size_t size;
    float data[];
} array;

/*
 * array constructor
 */
array *array_init(size_t size, float *to_arr)
{
    array *arr = malloc (sizeof(array) + sizeof(float) * size);

    if (arr == NULL){
        fputs ("Couldn't allocate memory\n", stderr);
        return NULL;
    }

    for (size_t i = 0; i < size; i++)
    {
        arr->data[i]=to_arr[i];
    }   
    
    arr->size = size;
    
    return arr;
}

/*
 * array constructor
 * filled with a float n
 */
array *array_like(size_t size, float n)
{
    array *arr = malloc (sizeof(array) + sizeof(float) * size);

    if (arr == NULL){
        fputs ("Couldn't allocate memory\n", stderr);
        return NULL;
    }

    for (size_t i = 0; i < size; i++)
    {
        arr->data[i] = n;
    }   
    
    arr->size=size;
    
    return arr;
}

/*
 * array destructor
 */
void array_del(array *arr)
{
    free (arr);
}

/*
 * element-wise array multiplication
 */
array *a_mul(array *a, array *b)
{
    array *tmp = array_like (a->size, 0.);
    
    if (tmp == NULL) {
      perror ("a_mul-malloc");
      return NULL;
    }

    for (size_t i = 0; i < a->size; i++)
    {
        tmp->data[i] = a->data[i] * b->data[i];
    }

    return tmp;
}

void prn_array (array *a)
{
  for (size_t i = 0; i < a->size; i++) {
    printf (i ? ", %4.1f" : "[ %4.1f", a->data[i]);
  }
  puts (" ]");
}

int main (void) {

  array *a = array_init (3, (float[]){1.0,2.0,3.0});
  array *b = array_init (3, (float[]){1.0,2.0,3.0});
  
  /* validate every allocation */
  if (a == NULL || b == NULL) {
    return 1;
  }
  
  /* output initial arrays */
  puts ("a and b:\n");
  prn_array (a);
  prn_array (b);
  
  /* multiply individual results to save pointer to free */
  array *res1   = a_mul (a,b);        /* validation omitted for brevity */
  array *res2   = a_mul (res1, b);    /*   "  */
  array *result = a_mul (res2, a);    /*   "  */
  
  /* output intermediate and final result */
  puts ("\nres1, res2 and result:\n");
  prn_array (res1);
  prn_array (res2);
  prn_array (result);
  
  /* now free the structures and data */
  array_del (a);
  array_del (b);
  array_del (res1);
  array_del (res2);
  array_del (result);
}

note: with single-allocation single-free, just typing free(a) is a lot shorter than array_del (a), but that is completely up to you.

Note in main() each of the pointers are preserved, e.g.

int main (void) {

  array *a = array_init (3, (float[]){1.0,2.0,3.0});
  array *b = array_init (3, (float[]){1.0,2.0,3.0});
  ...
  /* multiply individual results to save pointer to free */
  array *res1   = a_mul (a,b);        /* validation omitted for brevity */
  array *res2   = a_mul (res1, b);    /*   "  */
  array *result = a_mul (res2, a);    /*   "  */
  ...

You must do incremental multiplication, but that's the price you pay to preserve the pointers. (note: technically you could previously declare res1 and res2 and then do assignment within parenthesis in nested calls, e.g. a_mul((res2 = a_mul((res1 = amul(a,b)),b)),a), but that becomes an unreadable error-prone mess quickly, but it can be done...)

Example Use/Output

Compiling the code above and running it will produce the following output:

$ ./array-data-struct
a and b:

[  1.0,  2.0,  3.0 ]
[  1.0,  2.0,  3.0 ]

res1, res2 and result:

[  1.0,  4.0,  9.0 ]
[  1.0,  8.0, 27.0 ]
[  1.0, 16.0, 81.0 ]

Memory Use/Error Check

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.

It is imperative that you use a memory error checking program to ensure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.

For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.


$ ./array-data-struct
==70268== Memcheck, a memory error detector
==70268== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==70268== Using Valgrind-3.24.0 and LibVEX; rerun with -h for copyright info
==70268== Command: ./bin/array-data-struct
==70268==
a and b:

[  1.0,  2.0,  3.0 ]
[  1.0,  2.0,  3.0 ]

res1, res2 and result:

[  1.0,  4.0,  9.0 ]
[  1.0,  8.0, 27.0 ]
[  1.0, 16.0, 81.0 ]
==70268==
==70268== HEAP SUMMARY:
==70268==     in use at exit: 0 bytes in 0 blocks
==70268==   total heap usage: 6 allocs, 6 frees, 1,124 bytes allocated
==70268==
==70268== All heap blocks were freed -- no leaks are possible
==70268==
==70268== For lists of detected and suppressed errors, rerun with: -s
==70268== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Always confirm that you have freed all memory you have allocated and that there are no memory errors.

Look things over and let me know if you have questions.

发布评论

评论列表(0)

  1. 暂无评论