Unexpected path truncation of files contained in a tar file by PharData::extractTo() #19311

@cedric-anne

Description

The following code:

<?php

$long_path = 'a-very-long-path/to-a-file/with-a-very-long-name/in-a-deep-directory-structure';

if (!is_dir($long_path)) {
    mkdir($long_path, recursive: true);
    touch($long_path . '/a-php-file-with-a-very-long-name.php');
}

// Build one archive per tar format, extract it with PharData, and list the extracted files.
foreach (['ustar', 'posix', 'gnu'] as $format) {
    echo sprintf('Format: %s', $format) . PHP_EOL;

    $archive = sprintf('archive-%s.tar', $format);
    $tmp_dir = sys_get_temp_dir() . '/' . $format . '-' . bin2hex(random_bytes(10));

    // Create the archive with the system tar utility in the requested format.
    $command = sprintf('tar --create --file=%s --format=%s a-very-long-path', $archive, $format);
    exec($command);

    $tar = new PharData($archive);
    $tar->extractTo($tmp_dir, overwrite: true);

    // Print the path of every extracted file, relative to the extraction directory.
    $dir = new RecursiveIteratorIterator(new RecursiveDirectoryIterator($tmp_dir));
    foreach ($dir as $file) {
        if (!$file->isFile()) {
            continue;
        }
        echo '> ' . str_replace($tmp_dir . '/', '', $file->getRealPath()) . PHP_EOL;
    }
    echo PHP_EOL;
}

Resulted in this output:

Format: ustar
> a-very-long-path/to-a-file/with-a-very-long-name/in-a-deep-directory-structure/a-php-file-with-a-very-long-name.php

Format: posix
> a-very-long-path/to-a-file/with-a-very-long-name/in-a-deep-directory-structure/a-php-file-with-a-ver

Format: gnu
> a-very-long-path/to-a-file/with-a-very-long-name/in-a-deep-directory-structure/a-php-file-with-a-very-long-name.php

But I expected this output instead:

Format: ustar
> a-very-long-path/to-a-file/with-a-very-long-name/in-a-deep-directory-structure/a-php-file-with-a-very-long-name.php

Format: posix
> a-very-long-path/to-a-file/with-a-very-long-name/in-a-deep-directory-structure/a-php-file-with-a-very-long-name.php

Format: gnu
> a-very-long-path/to-a-file/with-a-very-long-name/in-a-deep-directory-structure/a-php-file-with-a-very-long-name.php

I did not find a clear summary of the limitations of the different tar formats, but as far as I understand from https://www.gnu.org/software/tar/manual/html_section/Formats.html, the pax tar format (POSIX 1003.1-2001) should not have this 100-character limit on file paths.

When extracted with the tar utility, the paths are not truncated.
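
For context, here is a minimal sketch of how a pax archive is understood to carry a long path. As far as I understand the format, the full path goes into an extended header record of the form `<length> path=<value>\n`, where `<length>` counts the whole record including its own digits, while the classic 100-byte name field can only hold a prefix of it. The helper names `buildPaxRecord()` and `parsePaxRecords()` are hypothetical and written only for this illustration; they are not part of ext/phar and do not describe how PharData is implemented.

```php
<?php

/**
 * Hypothetical helper: build one pax extended-header record,
 * "<len> <key>=<value>\n", where <len> is the byte length of the
 * whole record, including the digits of <len> itself.
 */
function buildPaxRecord(string $key, string $value): string
{
    $body = ' ' . $key . '=' . $value . "\n";
    $bodyLen = strlen($body);
    $total = $bodyLen + 1;
    // The length field counts itself, so adjust until it is stable.
    while (strlen((string) $total) + $bodyLen !== $total) {
        $total = strlen((string) $total) + $bodyLen;
    }
    return $total . $body;
}

/**
 * Hypothetical helper: parse the data block of a pax extended header
 * into a key => value map. Stops at the first malformed record.
 */
function parsePaxRecords(string $data): array
{
    $records = [];
    $offset = 0;
    $size = strlen($data);
    while ($offset < $size) {
        $space = strpos($data, ' ', $offset);
        if ($space === false) {
            break;
        }
        $digits = $space - $offset;
        $len = (int) substr($data, $offset, $digits);
        // A record needs at least its digits, one space and one newline.
        if ($len < $digits + 2 || $offset + $len > $size) {
            break;
        }
        // Content between the space and the trailing newline.
        $record = substr($data, $space + 1, $len - $digits - 2);
        $eq = strpos($record, '=');
        if ($eq === false) {
            break;
        }
        $records[substr($record, 0, $eq)] = substr($record, $eq + 1);
        $offset += $len;
    }
    return $records;
}

$path = 'a-very-long-path/to-a-file/with-a-very-long-name/in-a-deep-directory-structure/a-php-file-with-a-very-long-name.php';

$header = buildPaxRecord('path', $path);
$records = parsePaxRecords($header);

echo 'Full path length: ' . strlen($path) . PHP_EOL;
echo 'Recovered from pax record: ' . $records['path'] . PHP_EOL;
echo 'First 100 bytes only: ' . substr($path, 0, 100) . PHP_EOL;
```

The last line prints the same truncated path as the posix output above, which is consistent with only the 100-byte name field being used and the `path` record being ignored. That is a guess from the symptoms, not something confirmed in the ext/phar source.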

PHP Version

PHP 8.4.10 (cli) (built: Jul 22 2025 01:19:35) (NTS)
Copyright (c) The PHP Group
Built by https://github.com/docker-library/php
Zend Engine v4.4.10, Copyright (c) Zend Technologies
    with Zend OPcache v8.4.10, Copyright (c), by Zend Technologies
    with Xdebug v3.4.5, Copyright (c) 2002-2025, by Derick Rethans

Operating System

PHP docker image based on Debian, executed on Ubuntu 24.04
