Skip to content

Array_get and array_has functions#21637

Open
carlos-granados wants to merge 8 commits intophp:masterfrom
carlos-granados:array-get-array-has
Open

Array_get and array_has functions#21637
carlos-granados wants to merge 8 commits intophp:masterfrom
carlos-granados:array-get-array-has

Conversation

@carlos-granados
Copy link
Copy Markdown

This PR proposes adding two small, focused array functions to ext/standard for retrieving and checking nested array elements using dot notation, in a single step.

See the related RFC:
https://wiki.php.net/rfc/array_get_and_array_has


if (dot == NULL) {
/* No dot found, this is a simple key lookup */
zend_string *zkey = zend_string_init(key, key_len, 0);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a zend_string already here, then you can avoid the allocation.
In general, use zend_symtable_str_find (also below).


/* If key is null, return the whole array */
if (key == NULL || Z_TYPE_P(key) == IS_NULL) {
ZVAL_ARR(return_value, zend_array_dup(ht));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid a duplication, just return a copy with the refcount. Sadly, ZVAL_ARR sucks as it doesn't take into account the mutability in the type info; so you'd have to use Z_PARAM_ARRAY so you have a zval that you can use RETURN_COPY on.

result = array_get_nested(ht, Z_STRVAL_P(key), Z_STRLEN_P(key));

if (result != NULL) {
ZVAL_COPY(return_value, result);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RETURN_COPY

result = zend_hash_index_find(ht, Z_LVAL_P(key));

if (result != NULL) {
ZVAL_COPY(return_value, result);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RETURN_COPY


/* Key not found, return default value */
if (default_value != NULL) {
ZVAL_COPY(return_value, default_value);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RETURN_COPY

if (default_value != NULL) {
ZVAL_COPY(return_value, default_value);
} else {
RETVAL_NULL();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary, the return value is NULL implicitly

}

/* Invalid key type */
RETURN_FALSE;
Copy link
Copy Markdown
Member

@ndossche ndossche Apr 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case is impossible by using the type int|string, use ZEND_UNREACHABLE();

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not yet impossible, because Z_PARAM_ZVAL, but yes, it should be properly typed in ZPP.

}

/* Recurse into the nested array with the remaining key */
return array_get_nested(Z_ARRVAL_P(current), dot + 1, key_len - segment_len - 1);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless this gets tailcall eliminated, you have primitive recursion here which can overflow. Best to write an explicit loop.

current = zend_symtable_find(ht, segment);
zend_string_release(segment);

if (current == NULL || Z_TYPE_P(current) != IS_ARRAY) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this handle references properly?

@carlos-granados
Copy link
Copy Markdown
Author

@ndossche thanks for your review. Following some comments on the RFC discussion I have updated the implementation to also accept an array in the $key parameter. I also looked into all the other issues that you raised (though some do not apply any longer)

}

/* Handle integer keys (simple lookup) */
ZEND_ASSERT(Z_TYPE_P(key) == IS_LONG);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is insufficient, the ZPP parsing still accepts any argument. The stubs don't enforce the argument type check in any way.

/* Use php_explode to split the string by '.' */
zend_string *delim = ZSTR_CHAR('.');
array_init(&segments_array);
php_explode(delim, Z_STR_P(key), &segments_array, ZEND_LONG_MAX);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit of an inefficient way to go about it.

result = zend_hash_index_find(ht, Z_LVAL_P(key));

if (result != NULL) {
RETURN_COPY(result);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work properly with references I believe.
The function stub doesn't seem to return a reference, so this needs to be RETURN_COPY_DEREF (same issue is also present in other places).

@carlos-granados
Copy link
Copy Markdown
Author

After some discussions, I have decided to remove the dot notation functionality and just leave the array paths option. This makes the code simpler

num_segments = zend_hash_num_elements(path);

/* Iterate through each segment in the path array */
for (idx = 0; idx < num_segments; idx++) {
Copy link
Copy Markdown
Member

@ndossche ndossche Apr 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way you wrote this implies something about the design of the function/API (this is important to note for your RFC).

This means that the array must be a list, not just a regular array.
e.g. the following will fail even though intuitively it should work:

<?php
$path = ['hello', 1, 0];
unset($path[1]);
print_r($path);
/* Prints
Array
(
    [0] => hello
    [2] => 0
)
*/

var_dump(array_get([
    'hello' => ['world'],
], $path)); // Prints NULL

Either this should be fixed in the code or the RFC should explicitly spell this out.

One way to fix this is to use ZEND_HASH_FOREACH
EDIT: Looking at your RFC, you should indeed use ZEND_HASH_FOREACH

}

/* Check if the segment exists and is an array for next iteration */
if (current == NULL || Z_TYPE_P(current) != IS_ARRAY) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

References are handled incorrectly:

<?php
$path = ['hello', 0];

$array2 = ['world'];
$array = ['hello' => &$array2];

var_dump(array_get($array, $path)); // Prints NULL, should print 'world'


/* Path not found, return default value */
if (default_value != NULL) {
RETURN_COPY_DEREF(default_value);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this DEREF is necessary, as the function gets its default value passed by value.

/* {{{ Retrieves a value from a deeply nested array using an array path */
PHP_FUNCTION(array_get)
{
zval *array;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where possible, please merge the declaration and assignment. Splitting them is bad practice nowadays because the scope of the variable is unclear.

/* {{{ Helper function to get a nested value from array using an array of path segments */
static zval* array_get_nested(HashTable *ht, HashTable *path)
{
zval *segment_val;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where possible, please merge the declaration and assignment. Splitting them is bad practice nowadays because the scope of the variable is unclear.

}

/* If this is the last segment, return the result */
if (idx == num_segments - 1) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this check can be avoided by initializing current to NULL and returning current at the end of the function

}

/* Segment must be a string or int */
if (Z_TYPE_P(segment_val) == IS_STRING) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and references are also not handled for segment_val. You can use zend_hash_index_find_deref, or if you switch to a ZEND_HASH_FOREACH loop use something like ZVAL_DEREF.

@carlos-granados
Copy link
Copy Markdown
Author

@ndossche many thanks for your insightful comments, hopefully everything is OK now

Copy link
Copy Markdown
Member

@ndossche ndossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't re-test, but my concerns are resolved and it looks fine from a technical PoV.
I'll leave the final review to someone else for after the RFC has been voted on; when it has you also need to add an entry to the UPGRADING file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants