[FLINK-36209] Remove redundant operations in the initialization of KafkaSourceEnumState #116

Merged: 5 commits, Sep 19, 2024

Conversation

xiaochen-zhou
Contributor

What is the purpose of the change

In some methods, such as DynamicKafkaSourceEnumerator#onHandleSubscribedStreamsFetch(), partitions are split into assignedPartitions and unassignedInitialPartitions before being passed to the KafkaSourceEnumState constructor. The constructor then recombines assignedPartitions and unassignedInitialPartitions into a single partitions collection, so the split is wasted work. Passing partitions directly when initializing KafkaSourceEnumState removes this redundant split-and-merge step and its overhead.
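The redundancy described above can be illustrated with a minimal, self-contained sketch. The classes `TaggedPartition` and `EnumState` and the methods `viaSplitAndMerge`, `direct`, and `sample` are hypothetical stand-ins invented for this illustration; they are not the connector's actual classes or signatures. The sketch only shows that splitting by status and merging again yields the same set as passing the partitions straight through.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EnumStateSketch {
    enum AssignmentStatus { ASSIGNED, UNASSIGNED_INITIAL }

    /** Hypothetical stand-in for a topic partition tagged with its assignment status. */
    static final class TaggedPartition {
        final String topic;
        final int partition;
        final AssignmentStatus status;

        TaggedPartition(String topic, int partition, AssignmentStatus status) {
            this.topic = topic;
            this.partition = partition;
            this.status = status;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TaggedPartition)) {
                return false;
            }
            TaggedPartition other = (TaggedPartition) o;
            return topic.equals(other.topic)
                    && partition == other.partition
                    && status == other.status;
        }

        @Override
        public int hashCode() {
            return Objects.hash(topic, partition, status);
        }
    }

    /** Hypothetical stand-in for the enumerator state; it stores one combined set. */
    static final class EnumState {
        final Set<TaggedPartition> partitions;

        EnumState(Set<TaggedPartition> partitions) {
            this.partitions = new HashSet<>(partitions);
        }
    }

    /** Roundabout path: split by status, then merge the two halves back together. */
    static EnumState viaSplitAndMerge(Set<TaggedPartition> all) {
        Set<TaggedPartition> assigned = new HashSet<>();
        Set<TaggedPartition> unassignedInitial = new HashSet<>();
        for (TaggedPartition p : all) {
            (p.status == AssignmentStatus.ASSIGNED ? assigned : unassignedInitial).add(p);
        }
        Set<TaggedPartition> merged = new HashSet<>(assigned);
        merged.addAll(unassignedInitial);
        return new EnumState(merged);
    }

    /** Direct path: hand over the already-tagged partitions unchanged. */
    static EnumState direct(Set<TaggedPartition> all) {
        return new EnumState(all);
    }

    /** Small sample input with one partition of each status. */
    static Set<TaggedPartition> sample() {
        Set<TaggedPartition> all = new HashSet<>();
        all.add(new TaggedPartition("orders", 0, AssignmentStatus.ASSIGNED));
        all.add(new TaggedPartition("orders", 1, AssignmentStatus.UNASSIGNED_INITIAL));
        return all;
    }

    public static void main(String[] args) {
        Set<TaggedPartition> all = sample();
        // Both paths end in the same state; the split-and-merge adds only extra copies.
        System.out.println(viaSplitAndMerge(all).partitions.equals(direct(all).partitions));
    }
}
```

Under these assumptions the two paths produce equal states, which is why the intermediate split can be dropped without changing behavior.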

Brief change log

Verifying this change

  • This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not applicable)

boring-cyborg bot commented Sep 4, 2024

Thanks for opening this pull request! Please check out our contributing guidelines. (https://flink.apache.org/contributing/how-to-contribute.html)

@xiaochen-zhou changed the title from "Reduce some redundant operations in the initialization of KafkaSourceEnumState" to "[FLINK-34467] Reduce some redundant operations in the initialization of KafkaSourceEnumState" on Sep 4, 2024
@xiaochen-zhou changed the title from "[FLINK-34467] Reduce some redundant operations in the initialization of KafkaSourceEnumState" to "[FLINK-36209] Reduce some redundant operations in the initialization of KafkaSourceEnumState" on Sep 4, 2024
@xiaochen-zhou changed the title from "[FLINK-36209] Reduce some redundant operations in the initialization of KafkaSourceEnumState" to "[FLINK-36209] Remove redundant operations in the initialization of KafkaSourceEnumState" on Sep 5, 2024
@xiaochen-zhou
Contributor Author

Friendly ping, do you have time to take a look @AHeise 🙏 ?

Contributor

@AHeise AHeise left a comment

Thank you very much for your contribution. The change almost LGTM with one little change. I'm running CI.

Comment on lines 114 to 115
private static KafkaSourceEnumState deserializeTopicPartitions(byte[] serializedTopicPartitions,
AssignmentStatus status)
Contributor

I'd rename to deserializeAssignedPartitions and inline status. I don't see it being used with UNASSIGNED.

Contributor Author

I'd rename to deserializeAssignedPartitions and inline status. I don't see it being used with UNASSIGNED.

Thank you for the careful review. I have made both changes.

Contributor Author

I'd rename to deserializeAssignedPartitions and inline status. I don't see it being used with UNASSIGNED.

done.
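The rename-and-inline suggestion in the review thread above could look roughly like the following sketch. Everything here is an assumption made for illustration: the toy wire format (a count followed by topic/partition pairs), the `serialize` helper, and the `Map<String, AssignmentStatus>` return type are not taken from the connector's serializer. The point is only that a helper which is always called with one status can drop the parameter and state its intent in its name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

class DeserializeSketch {
    enum AssignmentStatus { ASSIGNED, UNASSIGNED_INITIAL }

    /** Assumed toy format: an int count, then (UTF topic, int partition) pairs. */
    static byte[] serialize(String[] topics, int[] partitions) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(topics.length);
            for (int i = 0; i < topics.length; i++) {
                out.writeUTF(topics[i]);
                out.writeInt(partitions[i]);
            }
        }
        return bytes.toByteArray();
    }

    /**
     * Shape after the suggested change: no status parameter, because every
     * caller passed ASSIGNED anyway. The status is inlined in the loop body.
     */
    static Map<String, AssignmentStatus> deserializeAssignedPartitions(byte[] serialized)
            throws IOException {
        Map<String, AssignmentStatus> result = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(serialized))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String topic = in.readUTF();
                int partition = in.readInt();
                result.put(topic + "-" + partition, AssignmentStatus.ASSIGNED);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialize(new String[] {"orders", "orders"}, new int[] {0, 1});
        // prints {orders-0=ASSIGNED, orders-1=ASSIGNED}
        System.out.println(deserializeAssignedPartitions(data));
    }
}
```

Dropping an always-constant parameter narrows the helper's contract, so a reader no longer has to check whether an UNASSIGNED case exists.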

@xiaochen-zhou
Contributor Author

Friendly ping: CI has passed. If you have time, please take another look. Thanks a lot, @AHeise.

@AHeise AHeise self-assigned this Sep 19, 2024
Contributor

@AHeise AHeise left a comment

LGTM. Thank you very much for your contribution. Will merge now.

@AHeise AHeise merged commit 52e7e58 into apache:main Sep 19, 2024
13 checks passed