Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: CallError redesign #548

Merged
merged 12 commits into from
Jan 16, 2025
Merged

feat: CallError redesign #548

merged 12 commits into from
Jan 16, 2025

Conversation

lwshang
Copy link
Contributor

@lwshang lwshang commented Jan 16, 2025

Description

There are a few reason we need a redesign of the CallError:

  • The previous design made it hard to get the RejectCode which is the most common used field to do error handling.
  • There is an initiative to expose error_code in system API. To avoid breaking changes at that time, we start to include error_code now.

Error types

Aside from the CallError, I introduced SystemError. SystemError is a variant of CallError.

The error type of call_raw() and call_oneway() is SystemError. It implies that the call is rejected by the system.

pub struct SystemError {
    pub reject_code: RejectCode,
    pub reject_message: String,
    pub error_code: ErrorCode,
    pub sync: bool,
}

The error type of call() and call_tuple is CallError. They might also failed when decoding the reply.

pub enum CallError {
    CallRejected(SystemError),
    CandidDecodeFailed(String),
}

Error codes

Borrowed the ErrorCode implementation from pocket-ic.

How Has This Been Tested?

ic-cdk-timers internally make inter-canister calls to itself. The change there demonstrates that the error handling is simplified.

Checklist:

  • The title of this PR complies with Conventional Commits.
  • I have edited the CHANGELOG accordingly.
  • I have made corresponding changes to the documentation.

@lwshang lwshang marked this pull request as ready for review January 16, 2025 15:37
@lwshang lwshang requested a review from a team as a code owner January 16, 2025 15:37
@lwshang lwshang requested a review from mraszyk January 16, 2025 15:40
Comment on lines 301 to 303
/// As of the current version of the IC, the error codes are not available in the system API.
/// Please DO NOT rely on the error codes until they are officially supported.
pub error_code: ErrorCode,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of having this 'pub' when the value is generally incorrect (from reject_to_error) and the error codes haven't been officially finalized?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove the pub for now. But when ic0.msg_error_code is officially supported, we will need to add back the pub. Then that will be a breaking change.

One alternative is to make the fields private and provide public getters. But that will harm the UX when users want to extract multiple fields.

After more thinking, I feel that the alternative approach above could be better. I will followup with Martin.

Co-authored-by: Eric Swanson <[email protected]>
@lwshang lwshang merged commit 68f98ce into next Jan 16, 2025
11 checks passed
@lwshang lwshang deleted the lwshang/CallError branch January 16, 2025 19:15
Comment on lines -45 to -46
// 0 is a special code meaning "no error"
0 => RejectCode::NoError,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it make sense to assert that the code is not zero?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need of assertion right? Since there is no variant for 0 now. And the RejectCode instances are constructed using TryFrom<u32> which will panic for 0.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the RejectCode instances are constructed using TryFrom which will panic for 0.

I missed that panic which is actually dangerous - please see my other comment. Panicking in case of 0 makes sense though since 0 can never become a new reject code.

/// Returns the reject message.
///
/// When the call was rejected asynchronously (IC rejects the call after it was enqueued),
/// this message is set with [`msg_reject`](crate::api::msg_reject).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reject message can also be produced by the system (canister stopped, out of cycles etc.) or come from crate::api::trap (including a wrapper by the system in case of traps)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a typo.
I wanted to mention

[`msg_reject_msg`](crate::api::msg_reject_msg).

/// this message is set with [`msg_reject`](crate::api::msg_reject).
///
/// When the call was rejected synchronously (`ic0.call_preform` returns non-zero code),
/// this message is set to a fixed string ("failed to enqueue the call").
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend to stick to the current message:

  • for the sake of backward-compatibility (no change in behavior)
  • "failed to enqueue the call" sounds like the queue is the issue, but the failure could also be due to cycles (for instance)

async fn call_raw(self) -> CallResult<Vec<u8>> {
let args_raw = encode_args(self.args).map_err(encoder_error_to_call_error::<T>)?;
async fn call_raw(self) -> SystemResult<Vec<u8>> {
// Candid Encoding can only fail if heap memory is exhausted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the heap memory is exhausted, the canister traps anyway

let reject_code = RejectCode::try_from(code).unwrap();
let result = Err(CallRejected {
reject_code,
reject_message: "failed to enqueue the call".to_string(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we define this as a string constant given that it is used at multiple places?

state.write().unwrap().result = Some(match msg_reject_code() {
0 => Ok(msg_arg_data()),
code => {
let reject_code = RejectCode::try_from(code).unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this unwrap could be dangerous for canisters with an outdated CDK version if the replica introduces a new reject code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants