i18n(ja): standardize SQL type names to English #23145
[APPROVALNOTIFIER] This PR is NOT APPROVED. It needs approval from an approver for each of the affected files.
Code Review
This pull request standardizes database type names (such as SMALLINT, BIGINT, float, double, and VARCHAR) across several Japanese documentation files, replacing their Japanese transliterations with standard English technical terms. The review feedback identifies three issues: a duplicate table header row in ticdc-avro-protocol.md, a typo (VARCHAR々) in ticdc-canal-json.md that should be corrected to VARCHAR, and an incorrect uppercase Golang type (FLOAT64) in ticdc-simple-protocol.md that should be changed to lowercase float64 for technical accuracy.
…lapping files)

Cherry-picked commit bdbb6f2 but excluded the 8 files that PR pingcap#23145 (i18n-ja-fix-type-names) handles independently.

Kept changes to:
- data-type-json.md
- ticdc/ticdc-debezium.md
- tidb-cloud/data-service-app-config-files.md
- tidb-cloud/tidb-cloud-console-auditing.md
- ai/integrations/vector-search-integrate-with-langchain.md
- functions-and-operators/numeric-functions-and-operators.md
ff8ff2a to 1775cf7
Hi @yahonda, would you please resolve the conflicts in this PR? Thanks.
Replace Japanese transliterations/katakana of SQL/programming type identifiers with canonical English names in all documentation files where they appeared as type labels in mapping tables:

SQL TYPE column (uppercase, matching EN source)
- スモールイント → SMALLINT
- ミディアムミント → MEDIUMINT
- ビッグイント → BIGINT
- フロート → FLOAT
- ダブル → DOUBLE
- タイニーイント → TINYINT
- 十進数 → DECIMAL
- チャー / チャール → CHAR
- バイナリ → BINARY
- 二進法 → VARBINARY
- タイニーブロブ → TINYBLOB
- ミディアムブロブ → MEDIUMBLOB
- ロングブロブ → LONGBLOB
- 小さなテキスト / 小さな文字 → TINYTEXT
- 中テキスト → MEDIUMTEXT
- 長文 → LONGTEXT
- ヴァルチャー → VARCHAR
- 可変長文字 → VARCHAR
- 列挙型 → ENUM
- タイムスタンプ → TIMESTAMP
- 日付 → DATE
- 日時 → DATETIME
- 時間 → TIME
- 年 → YEAR
- 少し → BIT
- ブール / ブール値 → BOOL / BOOLEAN
- 署名なし / 未署名 / 符号なし → UNSIGNED

JAVASCRIPT TYPE column (lowercase, matching EN source)
- 番号 → number
- 文字列 → string
- ヌル → null
- 整数 → int
- 長さ → long
- バイト → bytes

PARQUET TYPE column
- バイト配列 → BYTE_ARRAY
- 固定長バイト配列 → FIXED_LEN_BYTE_ARRAY
- タイムスタンプマイクロ → TIMESTAMP_MICROS

Also fixed column headers to Japanese where tables were fully replaced (e.g. 'TiDB Cloud Serverlessの型 / JavaScriptの型', 'Parquet プリミティブ型 / Parquet 論理型 / TiDBまたはMySQLの型').

Affected files: ticdc-avro-protocol, ticdc-canal-json, ticdc-csv, ticdc-open-protocol, ticdc-simple-protocol, serverless-driver.md, serverless-export.md, import-parquet-files*.md, bookshop-schema-design.md, unique-serial-number-generation.md, system-variables.md (タイプ: フロート → タイプ: float), tidb-configuration-file.md, tidb-cloud-auditing.md

This completes the standardization of SQL/configuration type names across the entire i18n-ja-release-8.5 branch.
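For illustration only, here is a minimal Python sketch of how a mapping like the one in this commit message could be applied to Markdown table cells while leaving running text alone. The `TYPE_NAMES` subset and the `fix_table_rows` helper are hypothetical; the commits do not say what tooling, if any, was used to make the changes.

```python
# Hypothetical subset of the mapping listed in the commit message.
TYPE_NAMES = {
    "スモールイント": "SMALLINT",
    "ビッグイント": "BIGINT",
    "ヴァルチャー": "VARCHAR",
    "ダブル": "DOUBLE",
    "バイト配列": "BYTE_ARRAY",
}


def _fix_cell(cell: str) -> str:
    """Swap a cell only when its whole content is a known transliteration."""
    key = cell.strip()
    if key in TYPE_NAMES:
        # Keep the original padding around the cell content.
        return cell.replace(key, TYPE_NAMES[key])
    return cell


def fix_table_rows(markdown: str) -> str:
    """Rewrite type names in Markdown table rows; other lines pass through."""
    fixed = []
    for line in markdown.splitlines():
        if line.lstrip().startswith("|"):
            line = "|".join(_fix_cell(cell) for cell in line.split("|"))
        fixed.append(line)
    return "\n".join(fixed)


if __name__ == "__main__":
    sample = "| 型 | 説明 |\n| ダブル | 8 バイト |\nダブルクリックします。"
    print(fix_table_rows(sample))
```

Matching whole cells rather than substrings is a deliberate simplification: it keeps prose words such as ダブルクリック intact, but it would miss compound cells (for example a type name followed by 符号なし), which would need separate handling.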
- ticdc-avro-protocol: remove duplicate table header row
- ticdc-canal-json: fix VARCHARル leftover typo
- ticdc-simple-protocol: fix FLOAT64 → float64 (Go convention)
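Two of the issues fixed here (a stray character left on a type name and a repeated header row) are the kind of thing a small lint pass can catch. The following is a rough sketch under stated assumptions, not a tool used in this PR: `lint_tables` and both regular expressions are hypothetical, and the leftover check is a heuristic that only flags cells made of an uppercase identifier followed directly by katakana or the 々 mark.

```python
import re

# An uppercase identifier with trailing katakana or "々", e.g. "VARCHARル".
LEFTOVER_CELL = re.compile(r"^[A-Z][A-Z0-9_]*[\u3005\u30a0-\u30ff]+$")
# A Markdown table separator row such as "| --- | :---: |".
SEPARATOR_ROW = re.compile(r"^\|[\s:|-]+\|$")


def lint_tables(markdown: str) -> list[str]:
    """Report leftover characters and consecutive duplicate table rows."""
    problems = []
    previous_row = None
    for number, line in enumerate(markdown.splitlines(), start=1):
        row = line.strip()
        if not row.startswith("|"):
            previous_row = None  # outside any table
            continue
        if SEPARATOR_ROW.match(row):
            continue  # separators are ignored when comparing rows
        for cell in row.strip("|").split("|"):
            text = cell.strip()
            if LEFTOVER_CELL.match(text):
                problems.append(f"line {number}: leftover character in {text!r}")
        if row == previous_row:
            problems.append(f"line {number}: duplicate table row")
        previous_row = row
    return problems


if __name__ == "__main__":
    sample = "| 型 | 説明 |\n| 型 | 説明 |\n| --- | --- |\n| VARCHARル | x |"
    for problem in lint_tables(sample):
        print(problem)
```

Because separator rows are skipped, a header that is repeated after its separator is reported as well. Legitimately identical data rows would also be flagged, so the output is a list of candidates to review rather than definite errors.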
…ma tables

Fix 6 schema description tables in dev-guide-bookshop-schema-design.md:
- Column header: タイプ → 型
- Field names: restore Japanese translations to English (e.g., タイトル → title, ストック → stock, 名前 → name, etc.)
- Type values: restore katakana to canonical SQL types (e.g., ビギント → BIGINT, 小さな整数 → TINYINT, etc.)
- Descriptions: kept in Japanese as-is

i18n(ja): fix 整数 → int in sequence table

i18n(ja): フィールドタイプ → フィールドの型
i18n(ja): fix remaining type names in ticdc-canal-json type tables
- First table (MySQL Type mapping): binary, varbinary, text variants, blob variants, date/time types, SET, BIT, TiDBVectorFloat32
- Second table (Integer types): SMALLINT, MEDIUMINT, INT, BIGINT, UNSIGNED variants
- Third table (Java SQL Type): INTEGER, REAL, VARCHAR, CLOB, BIT, DATE, TIME, TIMESTAMP, BLOB
i18n(ja): fix remaining integer type names in canal-json
- tinyint unsigned → TINYINT UNSIGNED
- mediumint unsigned → MEDIUMINT UNSIGNED
- 整数 → INT
- [128、255] → [128, 255] (Japanese comma → ASCII comma)

i18n(ja): fix column type code table in ticdc-open-protocol
- Header: タイプ → 型
- ヌル → NULL, タイムスタンプ → TIMESTAMP
- 日付 → DATE, 時間 → TIME, 日時 → DATETIME, 年 → YEAR
- ブール値 → BOOLEAN, 少し → BIT
- 列挙型 → ENUM, セット → SET, 幾何学 → GEOMETRY
- 文字/バイナリ → CHAR/BINARY
- TiDBベクトルFLOAT32 → TiDBVectorFloat32
- 10月14日 → 10/14 (MT mistakenly translated the code as a date)

i18n(ja): fix 少し → Bit in bit flags table header
- Header: mysqlタイプ → mysqlType, and all column headers to EN
- All Japanese type names → canonical SQL types (lowercase like EN)
- 長さ → long, 弦 → string, バイト → bytes, FLOAT → float, DOUBLE → double
- 少し → BIT, ブール → BOOL, 列挙型 → ENUM, etc.
- TiDBベクトルFLOAT32 → TiDBVectorFloat32
Fix 3 audit log field tables in tidb-cloud-auditing.md:
- Field names restored to EN source (EVENT_CLASS, COST_TIME, etc.)
- Type names were already INTEGER/VARCHAR/TIMESTAMP/FLOAT
- Descriptions kept in Japanese as-is
- Additional CONNECTION and TABLE_ACCESS/GENERAL tables also fixed

i18n(ja): 社内使用 → 内部使用 for 'internal use'

i18n(ja): fix bit flags table
- 価値 → Value, name column to English
i18n(ja): Others → その他 (label, not a type name)
- data-type-json.md: JSON value type table (タイプ → 型, type values to EN)
- data-type-date-and-time.md: zero value date type names to EN
- tidb-limitations.md: CHAR/BINARY/VARCHAR/BLOB type names to EN
- tidb-cloud/tidb-cloud-console-auditing.md: audit event field and type names to EN
- data-type-numeric.md: UNSIGNED/ZEROFILL syntax elements to EN
- develop/dev-guide-create-secondary-indexes.md: bookshop schema table (same pattern)

i18n(ja): fix programming type names in protocol field tables

Change Japanese programming type names to English in protocol field definition tables across 6 files:
- 弦 → string
- 番号 → number
- 物体 → object
- ブール値/ブール → boolean (JavaScript/JSON types)
- 整数 → integer (config param types)

Affected: ticdc-simple-protocol, ticdc-canal-json, ticdc-open-protocol, ticdc-debezium, develop/serverless-driver (config table), tidb-cloud/data-service-app-config-files (config table)

i18n(ja): 関数 → function in config table type column

i18n(ja): タイプ → 型 in SQL level options table header
i18n(ja): タイプ → 型 in system-variables with English type values
- タイプ: ブール値 → 型: Boolean (161)
- タイプ: 列挙型 → 型: Enumeration (31)
- タイプ: 時間 → 型: Time (6)
- タイプ: float → 型: Float (40)
- タイプ:期間 → 型: Duration (5)

i18n(ja): タイプ: float → 型: Float in tidb-configuration-file

i18n(ja): 型: 整数 → 型: Integer in tidb-configuration-file
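The per-label counts in the commit message suggest a search-and-replace that tallies each rewrite. A hedged sketch of that idea follows; the `relabel` helper, the `TYPE_LABELS` table, and the regular expression are hypothetical, and the actual edits may well have been made by hand.

```python
import re
from collections import Counter

# Label values taken from the commit message; the helper itself is hypothetical.
TYPE_LABELS = {
    "ブール値": "Boolean",
    "列挙型": "Enumeration",
    "時間": "Time",
    "float": "Float",
    "期間": "Duration",
}

# "タイプ" followed by an ASCII or full-width colon, optional spaces, then the value.
LABEL = re.compile(r"タイプ\s*[::]\s*(\S+)")


def relabel(text: str) -> tuple[str, Counter]:
    """Rewrite known 'タイプ: X' labels to '型: Y' and count each rewrite."""
    counts: Counter = Counter()

    def swap(match: re.Match) -> str:
        value = match.group(1)
        if value not in TYPE_LABELS:
            return match.group(0)  # unknown value: leave the label untouched
        counts[value] += 1
        return f"型: {TYPE_LABELS[value]}"

    return LABEL.sub(swap, text), counts


if __name__ == "__main__":
    sample = "- タイプ: ブール値\n- タイプ:期間\n- タイプ: 整数"
    new_text, counts = relabel(sample)
    print(new_text)
    print(dict(counts))
```

Leaving unknown values untouched means anything missing from the table (整数 in the sample) stays visible for manual review instead of being silently rewritten.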
adc6300 to f88e990
Reverted files that had overlapping 非表示→不可視 changes:
- releases/release-5.0.0-rc.md
- releases/release-8.0.0.md
- sql-statement-alter-index.md
- sql-statement-create-index.md
- best-practices/index-management-best-practices.md

Preserved unique PR pingcap#23145 changes (タイプ→型) in system-variables.md
987bca8 to e5f1141
Standardize the N/A notation across the 4 files modified in this PR that used the Japanese '該当なし' for null/empty table cells

i18n(ja): 分野 → フィールド名 (fix field mistranslation)

分野 means 'academic field/discipline'; the correct translation for 'database/protocol field' is フィールド名. Fixed 11 table headers across ticdc-canal-json, ticdc-debezium, and ticdc-simple-protocol. Also fixed remaining タイプ/Type → 型 in simple-protocol headers.

i18n(ja): fix remaining field name translations and type values
- canal-json: sqlタイプ/mysqlタイプ → sqlType/mysqlType (protocol field names)
- create-secondary-indexes: 分野の説明 → フィールドの説明
- data-service-app-config-files: 分野 → フィールド名, タイプ → 型
- system-variables: 型: 文字列 → 型: String (26 occurrences)
- debezium: ペイロード → payload (protocol field paths)
- debezium: ソース.コミット_ts → source.commit_ts
- debezium: payload後 → payload.after

i18n(ja): fix remaining protocol field paths and type names
- debezium: スキーマ名 → schema.name, スキーマ.オプション → schema.optional, スキーマタイプ → schema.type
- canal-json: 配列 → Array
- simple-protocol: 配列 → Array (2 occurrences)

i18n(ja): 非表示のインデックス → 不可視インデックス
ab74dad to a5cfcd2
- Remove duplicated SET/RESOURCE_GROUP/CREATE/ADMIN/AS/VEC_COSINE_DISTANCE words that were left over from English sentence structure
What is changed, added or deleted? (Required)
Standardize SQL and programming type names from Japanese transliterations to canonical English forms across 14 files (~213 changes).
All SQL type identifiers in mapping tables are now in English (matching the EN source), while table column headers are kept in Japanese for readability. See the commit message for the full mapping of 30+ type name replacements.
This covers:
Not changed: prose uses of ダブルクリック (double click), ダブルクォート (double quote), and バイト配列 (byte array) in running text.
Which TiDB version(s) do your changes apply to? (Required)
What is the related PR or file link(s)?
Do your changes match any of the following descriptions?