getMediaTypeParams() doesn't handle quoted-string values #245

Open
bkdotcom opened this issue Aug 21, 2024 · 0 comments

https://www.rfc-editor.org/rfc/rfc7231#section-3.1.1.1

media-type = type "/" subtype *( OWS ";" OWS parameter )
type       = token
subtype    = token

The type/subtype MAY be followed by parameters in the form of name=value pairs.

parameter      = token "=" ( token / quoted-string )

A parameter value that matches the token production can be transmitted either as a token or within a quoted-string. The quoted and unquoted values are equivalent. For example, the following examples are all equivalent, but the first is preferred for consistency:

text/html;charset=utf-8
text/html;charset=UTF-8
Text/HTML;Charset="utf-8"
text/html; charset="utf-8"

Contrived example:

application/json;charSet="UTF-8"; FOO = "b; a\"r"
(the grammar does not allow whitespace around the "=", but we can handle it)

expected:

array(
  'charset' => 'UTF-8',  // charset value (considered case-insensitive) in particular should probably be strtolower'd 
  'foo' => 'b; a"r',
)

actual:
Undefined array key 1

array(
  'charset' => '"UTF-8"',
  'foo·' => ' "b',
  'a\"r"' => null,
)

A possible fix, something like this:

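public function getMediaTypeParams(): array
{
    $contentType = $this->getContentType();

    if ($contentType === null) {
        return array();
    }

    // drop the leading "type/subtype" so only the parameters remain
    $paramString = \preg_replace('/^.*?[;,]\s*/', '', $contentType);
    $regexToken = '[^\\s";,=]+';   // "=" excluded so it cannot end up inside a key
    $regexQuotedString = '"(?:\\\\.|[^"\\\\])*"';   // escaped char, or anything but " and \
    $regex = '/
        (?P<key>' . $regexToken . ')
        \s*=\s*    # standard does not allow whitespace around =, tolerate it anyway
        (?P<value>' . $regexQuotedString . '|' . $regexToken . ')
        /x';

    \preg_match_all($regex, $paramString, $matches, \PREG_SET_ORDER);

    $params = array();
    foreach ($matches as $kvp) {
        $key = \strtolower($kvp['key']);   // parameter names are case-insensitive
        $value = $kvp['value'];
        if ($value[0] === '"') {
            // quoted-string: remove exactly one quote from each end, then unescape
            // (trim() would also eat an escaped quote at the end of the value)
            $value = \stripslashes(\substr($value, 1, -1));
        }
        $params[$key] = $value;
    }
    return $params;
}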
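
To try the approach above without the surrounding class, here is a rough standalone sketch. The function name `parseMediaTypeParams` and its string argument are assumptions made for illustration, not part of the library; it also strips exactly one quote from each end of a quoted value and unescapes with a regex rather than `stripslashes`.

```php
<?php

/**
 * Hypothetical standalone helper (not part of the library):
 * parse the parameters of a media-type string into a name => value array.
 */
function parseMediaTypeParams(string $contentType): array
{
    // Drop the leading "type/subtype" so only the parameter list remains
    $paramString = \preg_replace('/^[^;]*;?\s*/', '', $contentType);
    $regexToken = '[^\\s";,=]+';                    // loose stand-in for the RFC token
    $regexQuotedString = '"(?:\\\\.|[^"\\\\])*"';   // escaped char, or anything but " and \
    $regex = '/
        (?P<key>' . $regexToken . ')
        \s*=\s*    # tolerate whitespace around = even though the grammar forbids it
        (?P<value>' . $regexQuotedString . '|' . $regexToken . ')
        /x';

    \preg_match_all($regex, $paramString, $matches, \PREG_SET_ORDER);

    $params = array();
    foreach ($matches as $kvp) {
        $key = \strtolower($kvp['key']);   // parameter names are case-insensitive
        $value = $kvp['value'];
        if ($value[0] === '"') {
            // quoted-string: drop one quote from each end, then turn \x into x
            $value = \preg_replace('/\\\\(.)/', '$1', \substr($value, 1, -1));
        }
        $params[$key] = $value;
    }
    return $params;
}

// The contrived header from this issue (the PHP literal \\ is a single backslash)
$params = parseMediaTypeParams('application/json;charSet="UTF-8"; FOO = "b; a\\"r"');
\var_export($params);
```

For the contrived header this should give `charset` => `UTF-8` and `foo` => `b; a"r`, matching the expected array.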

The fix could go further and strtolower the value if the key is charset.
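
The charset lowercasing could look roughly like this. `normalizeCharsetParam` is a hypothetical helper name, and the sketch assumes parameter names have already been lowercased by the parser.

```php
<?php

/**
 * Hypothetical helper (name and signature are illustrative only):
 * lowercase the value of the "charset" parameter and leave the rest alone.
 */
function normalizeCharsetParam(array $params): array
{
    if (isset($params['charset'])) {
        // charset values are case-insensitive, so store one canonical form
        $params['charset'] = \strtolower($params['charset']);
    }
    return $params;
}

$params = normalizeCharsetParam(array('charset' => 'UTF-8', 'foo' => 'b; a"r'));
// $params['charset'] is now 'utf-8'; 'foo' is unchanged
```

Only charset is touched because other parameter values may be case-sensitive.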


related

@bkdotcom bkdotcom changed the title getMediaTypeParams doesn't handle quoted-string values getMediaTypeParams() doesn't handle quoted-string values or whitespace Aug 21, 2024
@bkdotcom bkdotcom changed the title getMediaTypeParams() doesn't handle quoted-string values or whitespace getMediaTypeParams() doesn't handle quoted-string values Aug 21, 2024