MDEV-38740 Implement JSON as a pluggable data type#5263
Conversation
Starting by intailizeing and loading json type plugin Put basic definition of Field_json
There was a problem hiding this comment.
Code Review
This pull request introduces a pluggable JSON data type plugin (type_json) for MariaDB, migrating the built-in JSON type handling to a modular plugin structure. The changes include adding the plugin implementation (plugin.cc, sql_type_json.cc, sql_type_json.h), updating parser rules in sql_yacc.yy, and adding comprehensive test suites. Feedback on the implementation highlights several critical issues: a potential null pointer dereference in Field_json::report_wrong_value when table is null, an incorrect charset mapping in make_conversion_table_field which should strictly use utf8mb4_bin, and a regression in aggregate_for_comparison that prevents comparisons between JSON and numeric or temporal types. Additionally, minor cleanups are recommended, such as removing commented-out method declarations and fixing a comment typo in plugin.cc.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| int Field_json::report_wrong_value(const ErrConv &val) | ||
| { | ||
| get_thd()->push_warning_truncated_value_for_field( | ||
| Sql_condition::WARN_LEVEL_WARN, "json", val.ptr(), table->s->db.str, | ||
| table->s->table_name.str, field_name.str); | ||
| reset(); | ||
| return 1; | ||
| } |
There was a problem hiding this comment.
In Field_json::report_wrong_value, the code accesses table->s->db.str and table->s->table_name.str without checking if table is NULL. In MariaDB, Field objects can sometimes have a NULL table pointer (e.g., when used as temporary or conversion fields). Accessing table->s directly in these cases will cause a server crash (segmentation fault).
int Field_json::report_wrong_value(const ErrConv &val)
{
get_thd()->push_warning_truncated_value_for_field(
Sql_condition::WARN_LEVEL_WARN, "json", val.ptr(),
table ? table->s->db.str : "",
table ? table->s->table_name.str : "",
field_name.str);
reset();
return 1;
}| Field *Type_handler_json::make_conversion_table_field( | ||
| MEM_ROOT *root, TABLE *table, uint metadata, const Field *target) const | ||
| { | ||
| /* A JSON field uses 4 bytes for length as json file is a long blob*/ | ||
| uint pack_length= metadata & 0x00ff; | ||
| if (pack_length != 4) | ||
| return NULL; | ||
|
|
||
| return new (root) Field_json(NULL, (uchar *) "", 1, Field::NONE, | ||
| &empty_clex_str, table->s, target->charset()); | ||
| } |
There was a problem hiding this comment.
In Type_handler_json::make_conversion_table_field, the charset of the conversion field is set to target->charset(). However, a Field_json must always use the utf8mb4_bin character set to ensure correct JSON representation and comparison behavior. If the target field has a different character set (e.g., latin1), creating a Field_json with that character set violates this invariant and can lead to encoding issues or assertion failures.
Field *Type_handler_json::make_conversion_table_field(
MEM_ROOT *root, TABLE *table, uint metadata, const Field *target) const
{
/* A JSON field uses 4 bytes for length as json file is a long blob*/
uint pack_length= metadata & 0x00ff;
if (pack_length != 4)
return NULL;
return new (root) Field_json(NULL, (uchar *) "", 1, Field::NONE,
&empty_clex_str, table->s, &my_charset_utf8mb4_bin);
}| // const Type_handler *type_handler_for_comparison() const override; | ||
| // const Type_handler *type_handler_for_tmp_table(const Item *item) const | ||
| // override; | ||
| // bool Item_hybrid_func_fix_attributes(THD *thd, const LEX_CSTRING &func_name, | ||
| // Type_handler_hybrid_field_type *handler, Type_all_attributes *func, | ||
| // Item **items, uint nitems) const override; | ||
|
|
||
| // virtual Item *create_typecast_item(THD *thd, Item *item, | ||
| // const Type_cast_attributes &attr) const override; | ||
|
|
||
| // Item *make_constructor_item(THD *thd, List<Item> *args) const override; |
There was a problem hiding this comment.
| const Type_handler * | ||
| Type_collection_json::aggregate_for_comparison(const Type_handler *a, | ||
| const Type_handler *b) const | ||
| { | ||
| if (a->type_collection() == this) | ||
| swap_variables(const Type_handler *, a, b); | ||
| if (a == &type_handler_json || a == &type_handler_hex_hybrid || | ||
| a == &type_handler_tiny_blob || a == &type_handler_blob || | ||
| a == &type_handler_medium_blob || a == &type_handler_long_blob || | ||
| a == &type_handler_varchar || a == &type_handler_string || | ||
| a == &type_handler_null) | ||
| return b; | ||
| return NULL; | ||
| } |
There was a problem hiding this comment.
In Type_collection_json::aggregate_for_comparison, the aggregation only allows comparison between JSON and other string/blob/null types. Any comparison between JSON and numeric types (like INT, DOUBLE, DECIMAL) or temporal types (like DATETIME, DATE) will return NULL. This prevents implicit comparison between JSON columns and numbers/dates, which is a regression compared to the previous LONGTEXT-based JSON implementation. Consider allowing aggregation with other basic types or delegating to the standard string type collection.
| NULL, // Status variables | ||
| NULL, // System variables | ||
| "1.0", // String version representation | ||
| MariaDB_PLUGIN_MATURITY_GAMMA // Maturity(see include/mysql/plugin.h)*/ |
- Implemented the JSON data type via 'MariaDB_DATA_TYPE_PLUGIN1 API - `SHOW CREATE TABLE` and `DESCRIBE` now explicitly display the column type as `json` instead of `longtext`. - Move json tests to the plugin and split it into smaller focused files inside the plugin test suite. - follow structure xmltype plugin
Implementation Progress
SHOW CREATE TABLEandDESCRIBEnow explicitly display the column type as json instead of longtext.CHARACTER SETandCOLLATEattributes (e.g., JSON CHARACTER SET utf8) without throwing parse errors.Testing
Migrated the built-in JSON tests (mysql-test/main/type_json.test) into the plugin test suite, splitting them into focused, modular files:
TODO
Implement implicit JSON validation through Field_json::store(). (Currently, inserting invalid JSON like 'a' succeeds unless an explicit CHECK constraint is added).