Skip to content

Fix null pointer dereferences when allocation fails#372

Open
aaranmcguire wants to merge 1 commit intoWizardMac:devfrom
aaranmcguire:fix/null-deref-on-alloc-failure
Open

Fix null pointer dereferences when allocation fails#372
aaranmcguire wants to merge 1 commit intoWizardMac:devfrom
aaranmcguire:fix/null-deref-on-alloc-failure

Conversation

@aaranmcguire
Copy link
Copy Markdown

PR 1: Fix null pointer dereferences when allocation fails

Summary

Six locations across the parser and writer immediately dereference the result
of calloc/malloc without checking whether the allocation succeeded. On
any memory-constrained system, or with a crafted file large enough to exhaust
address space, these become crash bugs.

Every site below is exercised on normal, well-formed files — the allocations
are not guarded by any prior size check that would prevent them from reaching
the failing path.


1. stata/readstat_dta_read.cdta_init_variable

Vulnerable code

static readstat_variable_t *dta_init_variable(dta_ctx_t *ctx, int i,
        int index_after_skipping, readstat_type_t type, size_t max_len) {
    readstat_variable_t *variable = calloc(1, sizeof(readstat_variable_t));

    variable->type = type;   /* ← crashes if calloc returned NULL */

Why it triggers

dta_init_variable is called once for every column in the file
(dta_handle_variables, iterated ctx->var_count times). A DTA file with a
few thousand columns, parsed on a system with fragmented heap or address-space
limits (e.g. a 32-bit process, or a container with ulimit -v), will fail
the calloc and crash on the very next line.

Reproduction

/* Minimal: open any .dta file on a system with < available RAM.
   Simpler lab reproduction with Address Sanitizer: */
setrlimit(RLIMIT_AS, &(struct rlimit){4096*1024, 4096*1024}); // 4 MB limit
readstat_parse_dta(parser, "any.dta", ctx); // → SIGSEGV in dta_init_variable

Fix

-static readstat_variable_t *dta_init_variable(dta_ctx_t *ctx, int i,
-        int index_after_skipping, readstat_type_t type, size_t max_len) {
+static readstat_variable_t *dta_init_variable(dta_ctx_t *ctx, int i,
+        int index_after_skipping, readstat_type_t type, size_t max_len,
+        readstat_error_t *out_retval) {
     readstat_variable_t *variable = calloc(1, sizeof(readstat_variable_t));

+    if (variable == NULL) {
+        if (out_retval) *out_retval = READSTAT_ERROR_MALLOC;
+        return NULL;
+    }
+
     variable->type = type;

Call site updated to pass &retval and check for NULL return before
dereferencing ctx->variables[i].


2. sas/readstat_sas7bdat_read.csas7bdat_parse_subheader_rdc

Vulnerable code

char *buffer = malloc(ctx->row_length);
char *output = buffer;
while (input + 2 <= ...) {
    /* immediately writes to *output — crashes if buffer is NULL */

Why it triggers

ctx->row_length comes from the SAS7BDAT page header and can be up to 64 KB
per row. The RDC decompressor is called for every compressed data page. On
memory-constrained systems, or with a crafted file that declares an extreme
row_length, malloc returns NULL and the loop crashes on the first write.

Reproduction

Craft a SAS7BDAT with page type = RDC compressed + row_length set to
a value that exhausts available memory. Any crash-testing fuzzer (AFL,
libFuzzer) will find this within seconds on a 32-bit target.

Fix

 char *buffer = malloc(ctx->row_length);
+if (buffer == NULL) return READSTAT_ERROR_MALLOC;
 char *output = buffer;

3. sas/readstat_sas7bdat_read.creadstat_parse_sas7bdat

Vulnerable code

sas7bdat_ctx_t    *ctx   = calloc(1, sizeof(sas7bdat_ctx_t));
sas_header_info_t *hinfo = calloc(1, sizeof(sas_header_info_t));

ctx->handle = parser->handlers;   /* ← crashes if either calloc returned NULL */

Why it triggers

These are the very first allocations made when parsing any SAS7BDAT file.
sizeof(sas7bdat_ctx_t) is ~600 bytes. On any system where the allocator
returns NULL (OOM, address-space exhaustion), the crash happens before a
single byte of the file is read.

Fix

 sas7bdat_ctx_t    *ctx   = calloc(1, sizeof(sas7bdat_ctx_t));
 sas_header_info_t *hinfo = calloc(1, sizeof(sas_header_info_t));

+if (ctx == NULL || hinfo == NULL) {
+    retval = READSTAT_ERROR_MALLOC;
+    goto cleanup;
+}
+
 ctx->handle = parser->handlers;

4. readstat_writer.creadstat_begin_writing_data

Vulnerable code

writer->row_len = row_len;
writer->row = malloc(writer->row_len);
if (writer->callbacks.begin_data) {
    retval = writer->callbacks.begin_data(writer);  /* uses writer->row */

Why it triggers

writer->row is used by every readstat_insert_*_value call. If malloc
fails here, writer->row is NULL. The next memset(writer->row, ...) or
any insert call crashes. row_len is the sum of all variable widths — with
many wide string variables it can reach tens of kilobytes.

Fix

 writer->row = malloc(writer->row_len);
+if (writer->row == NULL) { retval = READSTAT_ERROR_MALLOC; goto cleanup; }

5. readstat_writer.creadstat_string_ref_init

Vulnerable code

readstat_string_ref_t *ref = calloc(1, sizeof(readstat_string_ref_t) + len);
ref->first_o = -1;   /* ← crashes if calloc returned NULL */

Why it triggers

Called for every unique long string in a Stata DTA strL dataset. With many
unique long strings, or on a memory-limited system, calloc can fail.

Fix

 readstat_string_ref_t *ref = calloc(1, sizeof(readstat_string_ref_t) + len);
+if (ref == NULL) return NULL;
 ref->first_o = -1;

6. sas/readstat_xport_read.cxport_construct_format

Vulnerable code

char *format = malloc(4 * src_len + 1);
readstat_error_t retval = readstat_convert(format, 4 * src_len + 1, ...);
/* ← readstat_convert writes to format without checking for NULL */

Why it triggers

src_len is sizeof(namestr.nform) — a compile-time constant (8 bytes),
so the allocation is always small. However, the missing null check means any
OOM condition causes readstat_convert to write through a NULL pointer.

Fix

 char *format = malloc(4 * src_len + 1);
+if (format == NULL) return READSTAT_ERROR_MALLOC;
 readstat_error_t retval = readstat_convert(format, 4 * src_len + 1, ...);

Six locations immediately dereference the result of calloc/malloc without
checking whether the allocation succeeded. On memory-constrained systems
or with crafted files, these are reliable crash bugs.

- stata/readstat_dta_read.c: dta_init_variable() — calloc result used
  unconditionally; adds out_retval parameter and null guard
- sas/readstat_sas7bdat_read.c: sas7bdat_parse_subheader_rdc() — malloc
  for RDC decompression buffer unchecked
- sas/readstat_sas7bdat_read.c: readstat_parse_sas7bdat() — both ctx
  and hinfo calloc results used before any null check
- readstat_writer.c: readstat_begin_writing_data() — writer->row malloc
  unchecked; every subsequent insert call would crash
- readstat_writer.c: readstat_string_ref_init() — calloc for strL ref
  unchecked before dereferencing
- sas/readstat_xport_read.c: xport_construct_format() — malloc for
  format conversion buffer unchecked
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant