@Tony2h Tony2h commented Dec 7, 2025

Description

Issue(s)

  • Close/close/Fix/fix/Resolve/resolve: Issue Link

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

Copilot AI review requested due to automatic review settings December 7, 2025 00:28
@Tony2h Tony2h changed the base branch from main to 3.0 December 7, 2025 00:36
@gemini-code-assist
Summary of Changes

Hello @Tony2h, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request is a merge from the main branch to the 3.0 branch, incorporating a wide array of new features and improvements. Key areas of focus include the integration of a new MQTT library, extensive updates to documentation for new functionalities, significant enhancements to user management and security features, and notable improvements to the query engine's capabilities for advanced analytics and window functions. These changes aim to expand the system's functionality, improve security, and provide more robust data processing options.

Highlights

  • MQTT Library Integration: A new MQTT library (libmqtt) has been integrated into the project. This involved adding a new subdirectory to the CMake build system and migrating existing tmqtt and ttq components into this new contrib/libmqtt structure. Numerous header and source files were renamed and their include paths updated accordingly. This change centralizes MQTT-related functionalities within a dedicated contributed library.
  • Documentation Updates: The documentation has been updated to reflect new features and changes. This includes enhancements to STATE_WINDOW syntax with a zeroth_state parameter, the addition of a FORCE option for COMPACT commands, and comprehensive new sections detailing JSON data writing and instance registration for taosAdapter. New SQL hints (WIN_OPTIMIZE_BATCH, WIN_OPTIMIZE_SINGLE) and several new error codes have also been documented.
  • User Management and Security Enhancements: Significant updates have been made to user management, including new fields for password policy (e.g., passwordLifeTime, passwordReuseMax, failedLoginAttempts), TOTP seed, and connection limits (sessionPerUser, connectTime). New whitelist functionalities for IP addresses and datetime ranges have been introduced, along with corresponding API functions and internal handling. Encryption algorithm management has also been added, allowing for the creation, dropping, and listing of encryption algorithms, including built-in options like SM4, AES, SM3, SHA-256, SM2, and RSA.
  • Query Engine Improvements: The query engine has been enhanced with new physical plan nodes and logic for handling advanced window functions and generic analysis. This includes updates to STATE_WINDOW and INTERP clauses, and the introduction of DTW, DTW_PATH, and TLCC functions for time-series correlation analysis. The internal data block serialization/deserialization now supports an 'internal' format, and various operator parameters have been extended to support new query features.
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request is a merge from main to the 3.0 branch, incorporating a wide variety of changes. These include documentation updates and typo fixes, bug fixes in the query execution engine, and some refactoring.

My review has focused on the C code changes. I've identified a couple of areas for improvement:

  • An inefficiency in schema modification handling where hash tables are rebuilt repeatedly within a loop.
  • A potential correctness issue in the state_window operator due to the removal of a duplicate timestamp check.

Overall, the changes seem to be moving in the right direction, especially the fixes for state_window with PARTITION BY and the refactoring of the transport cache. Please see my detailed comments for specific suggestions.

Comment on lines +1073 to -1084
                                      pInfo->tsSlotId);
  if (NULL == pColInfoData) {
    pTaskInfo->code = terrno;
    T_LONG_JMP(pTaskInfo->env, terrno);
  }
  TSKEY* tsList = (TSKEY*)pColInfoData->pData;

  struct SColumnDataAgg* pAgg = NULL;
  struct SColumnDataAgg* pAgg = (pBlock->pBlockAgg != NULL) ?
                                &pBlock->pBlockAgg[pInfo->stateCol.slotId] :
                                NULL;
  EStateWinExtendOption  extendOption = pInfo->extendOption;
  SWindowRowsSup*        pRowSup = &pInfo->winSup;

  if (pRowSup->groupId != gid) {
    /*
      group changed, process the previous group's unclosed state window first
    */
    doKeepCurStateWindowEndInfo(pRowSup, tsList, 0, &extendOption, false);
    int32_t code = processClosedStateWindow(pInfo, pRowSup, pTaskInfo,
                                            pExprSup, numOfOutput);
    if (TSDB_CODE_SUCCESS != code) T_LONG_JMP(pTaskInfo->env, code);
    *numPartialCalcRows = pRowSup->startRowIndex + pRowSup->numOfRows;

    /*
      unhandled null rows should be ignored, since they belong to previous group
    */
    *numPartialCalcRows += pRowSup->numNullRows;

    /*
      reset state window info for new group
    */
    pInfo->hasKey = false;
    resetWindowRowsSup(pRowSup);
  }

  for (int32_t j = *startIndex; j < *endIndex; ++j) {
    if (pBlock->info.scanFlag != PRE_SCAN) {
      if (pInfo->winSup.lastTs == INT64_MIN || gid != pRowSup->groupId || !pInfo->hasKey) {
        pInfo->winSup.lastTs = tsList[j];
      } else {
        if (tsList[j] == pInfo->winSup.lastTs) {
          // forbid duplicated ts rows
          qError("%s:%d duplicated ts found in state window aggregation", __FILE__, __LINE__);
          pTaskInfo->code = TSDB_CODE_QRY_WINDOW_DUP_TIMESTAMP;
          T_LONG_JMP(pTaskInfo->env, TSDB_CODE_QRY_WINDOW_DUP_TIMESTAMP);
        }
      }
    }

high

The removal of the duplicate timestamp check is a concern. state_window aggregation is sensitive to row order, and duplicate timestamps can lead to non-deterministic results if their processing order changes between runs. This could be a significant correctness issue. It's recommended to either restore this check or ensure that data fed into this operator has guaranteed unique timestamps.
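
As a rough illustration of what restoring the check could look like in isolation, here is a minimal sketch. TsDupState, ts_dup_state_init, and find_duplicate_ts are hypothetical names made up for this example; they are not the operator's real SWindowRowsSup fields or helpers. The sketch assumes rows arrive ordered by timestamp within a group, so it only catches a row whose timestamp equals the one immediately before it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-group state the operator keeps;
   this is NOT the real SWindowRowsSup structure. */
typedef struct {
  int64_t  lastTs;   /* last timestamp seen, INT64_MIN when nothing was seen yet */
  uint64_t groupId;  /* group that lastTs belongs to */
} TsDupState;

static void ts_dup_state_init(TsDupState *st) {
  st->lastTs = INT64_MIN;
  st->groupId = 0;
}

/* Scans one block of timestamps for the given group.
   Returns the index of the first row whose timestamp equals the previous
   row's timestamp in the same group, or -1 when no such row exists.
   The state carries over between calls, so a duplicate that straddles
   two consecutive blocks of the same group is also reported. */
static int32_t find_duplicate_ts(TsDupState *st, uint64_t gid,
                                 const int64_t *tsList, int32_t numRows) {
  for (int32_t j = 0; j < numRows; ++j) {
    if (st->lastTs != INT64_MIN && st->groupId == gid && tsList[j] == st->lastTs) {
      return j; /* the caller decides how to report the error */
    }
    st->lastTs = tsList[j];
    st->groupId = gid;
  }
  return -1;
}
```

In the real operator, the caller would presumably translate a non-negative result into TSDB_CODE_QRY_WINDOW_DUP_TIMESTAMP, as the removed lines did. Because a change of group is treated as a fresh start, equal timestamps in different partitions are not flagged.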

Comment on lines +983 to +985
SML_CHECK_CODE(smlBuildTempHash(tagHashTmp, *tableMeta, (*tableMeta)->tableInfo.numOfColumns,
(*tableMeta)->tableInfo.numOfColumns + (*tableMeta)->tableInfo.numOfTags));
SML_CHECK_CODE(smlBuildTempHash(colHashTmp, *tableMeta, 0, (*tableMeta)->tableInfo.numOfColumns));

medium

The hash tables tagHashTmp and colHashTmp are rebuilt from scratch in every iteration of this loop by calling smlBuildTempHash. This can be inefficient, especially when processing a large number of schema changes at once, as it repeatedly iterates over the entire schema. A more optimal approach would be to build the hash tables once before the loop, and then rebuild them only after a schema change has been successfully applied via changeMeta.
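
The suggested pattern can be sketched on a toy schema as follows. Everything here is a hypothetical stand-in: Schema, Lookup, lookup_build, schema_add, and apply_fields are illustrative names only, loosely playing the roles of the table meta, the temporary hash, smlBuildTempHash, and changeMeta. The sketch assumes a schema change only ever appends a field.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_FIELDS 16
#define NAME_LEN   32
#define SLOTS      64 /* larger than MAX_FIELDS, so probing always finds an empty slot */

/* Hypothetical schema and lookup types; not TDengine's STableMeta or SHashObj. */
typedef struct { char names[MAX_FIELDS][NAME_LEN]; int count; } Schema;
typedef struct { int slot[SLOTS]; int builds; } Lookup;

static unsigned hash_name(const char *s) {
  unsigned h = 5381u;
  while (*s) h = h * 33u + (unsigned char)*s++;
  return h % SLOTS;
}

/* Stand-in for smlBuildTempHash: rebuilds the whole table and counts the rebuild. */
static void lookup_build(Lookup *lk, const Schema *s) {
  for (int i = 0; i < SLOTS; ++i) lk->slot[i] = -1;
  for (int i = 0; i < s->count; ++i) {
    unsigned h = hash_name(s->names[i]);
    while (lk->slot[h] != -1) h = (h + 1) % SLOTS; /* linear probing */
    lk->slot[h] = i;
  }
  lk->builds++;
}

static bool lookup_has(const Lookup *lk, const Schema *s, const char *name) {
  unsigned h = hash_name(name);
  while (lk->slot[h] != -1) {
    if (strcmp(s->names[lk->slot[h]], name) == 0) return true;
    h = (h + 1) % SLOTS;
  }
  return false;
}

/* Stand-in for changeMeta: appends a field, returns false when it cannot. */
static bool schema_add(Schema *s, const char *name) {
  if (s->count >= MAX_FIELDS || strlen(name) >= NAME_LEN) return false;
  strcpy(s->names[s->count++], name);
  return true;
}

/* Adds every field that is not yet in the schema and returns how many were added. */
static int apply_fields(Schema *s, Lookup *lk, const char **fields, int n) {
  int added = 0;
  lookup_build(lk, s); /* build once, before the loop */
  for (int i = 0; i < n; ++i) {
    if (lookup_has(lk, s, fields[i])) continue; /* nothing changed, table still valid */
    if (schema_add(s, fields[i])) {
      lookup_build(lk, s); /* rebuild only after a successful change */
      ++added;
    }
  }
  return added;
}
```

With this shape, the number of rebuilds is one plus the number of applied changes rather than one per loop iteration. Whether the same restructuring is safe in the actual function depends on what else changeMeta touches, which this sketch does not model.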

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
