Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(mizrahi): scrape extra info #890

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

simonlsk
Copy link

@simonlsk simonlsk commented Nov 22, 2024

Fix #784.

Changes:

  • Extract Cookie from main page request.
  • Fetch extra details request (endpoint /Online/api/OSH/getMaherBerurimSMF) if additionalTransactionInformation is true.
  • Concatenate details in txn.memo in the format field1: value1; field2: value2; ...;

Status and todos:

  • Tests passing locally.
  • Not sure about the returned memo format.
  • Check if there is a cleaner way to extract the cookie and use it in my requests.
  • Not stressed tested on many transactions.

I don't have a lot of experience with puppeteer, would love to get some inputs about the unchecked boxes.

Thanks

@baruchiro
Copy link
Collaborator

Thank you, I hope I will review it soon

@baruchiro
Copy link
Collaborator

Can I list you as a code owner for Mizrahi #830?

Copy link

Pull request has been marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the Stale label Jan 24, 2025
@github-actions github-actions bot removed the Stale label Jan 27, 2025
const PENDING_TRANSACTIONS_PAGE = '/osh/legacy/legacy-Osh-p420';
const PENDING_TRANSACTIONS_IFRAME = 'p420.aspx';
const CHANGE_PASSWORD_URL = /https:\/\/www\.mizrahi-tefahot\.co\.il\/login\/index\.html#\/change-pass/;
const DATE_FORMAT = 'DD/MM/YYYY';
const MAX_ROWS_PER_REQUEST = 10000000000;
const TRANSACTION_DETAILS_REQUEST_CONCURRENCY = 1;
const TRANSACTION_DETAILS_REQUEST_WAIT_TIME = 500; // ms
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add _MS to the variable name and save the comment.

Suggested change
const TRANSACTION_DETAILS_REQUEST_WAIT_TIME = 500; // ms
const TRANSACTION_DETAILS_REQUEST_WAIT_TIME_MS = 500; // ms

}));

const cookies = await this.page.cookies();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Page-level cookie API is deprecated. Use Browser.cookies or BrowserContext.cookies instead.

Comment on lines +327 to +337
const headersMap: Record<string, any> = {};
const response = await Promise.any(TRANSACTIONS_REQUEST_URLS.map(async (url) => {
const request = await this.page.waitForRequest(url);
const data = createDataFromRequest(request, this.options.startDate);
const headers = createHeadersFromRequest(request);
headersMap[url] = createHeadersFromRequest(request);

return fetchPostWithinPage<ScrapedTransactionsResult>(this.page, url, data, headers);
return fetchPostWithinPage<ScrapedTransactionsResult>(this.page, url, data, headersMap[url]);
}));

const cookies = await this.page.cookies();
const headers = Object.values(headersMap)[0];
headers.Cookie = cookies.map((cookie) => `${cookie.name}=${cookie.value}`).join('; ');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this headersMap solution which become another headers object.

But I can't find another way right now.

Comment on lines +205 to +207
const recordsWithDetails = originalRecords
.map((record, index) => ({ record, index }))
.filter(({ record }) => record.MC02ShowDetailsEZ === '1');
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to filter and then map

return memo;
}

async function getExtraScrap(originalRecords: ScrapedTransaction[], currentTxns: Transaction[], headers: Headers): Promise<Transaction[]> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename to addExtraInfo

Comment on lines +209 to +215
const promises = recordsWithDetails.map(({ record }) => getTransactionExtraScrap(record, headers));
let accounts: Array<ExtraTransactionResult | null> = [];
while (promises.length > 0) {
const currentPromises = promises.splice(0, TRANSACTION_DETAILS_REQUEST_CONCURRENCY);
accounts = accounts.concat(await Promise.all(currentPromises));
await sleep(TRANSACTION_DETAILS_REQUEST_WAIT_TIME);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're trying to do a batch proccesing here, but you missed something.

Once you run an async function, it runs! When you called getTransactionExtraScrap(record, headers), the function executed. It happened on the function call and not on the await call.

In your code, the promises hold all the executions, and you just waiting them one-by-one, it is not logical.

Comment on lines +217 to +225
const txnsWithExtra = currentTxns.map((txn, i) => {
const extraDetailIndex = recordsWithDetails.findIndex(({ index }) => index === i);
const extraDetails = extraDetailIndex !== -1 ? accounts[extraDetailIndex] : undefined;
const currentTxn = { ...txn };
if (extraDetails) {
currentTxn.memo = simplifyExtraTransactionResultsToMemo(extraDetails);
}
return currentTxn;
});
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you want to clone the current transaction, what do you think about this?

Suggested change
const txnsWithExtra = currentTxns.map((txn, i) => {
const extraDetailIndex = recordsWithDetails.findIndex(({ index }) => index === i);
const extraDetails = extraDetailIndex !== -1 ? accounts[extraDetailIndex] : undefined;
const currentTxn = { ...txn };
if (extraDetails) {
currentTxn.memo = simplifyExtraTransactionResultsToMemo(extraDetails);
}
return currentTxn;
});
const txnsWithExtra = currentTxns.map(({ ...currentTxn }, i) => {
const extraDetailIndex = recordsWithDetails.findIndex(({ index }) => index === i);
const extraDetails = extraDetailIndex !== -1 ? accounts[extraDetailIndex] : undefined;
if (extraDetails) {
currentTxn.memo = simplifyExtraTransactionResultsToMemo(extraDetails);
}
return currentTxn;
});

Comment on lines +190 to +202
function simplifyExtraTransactionResultsToMemo(extraResult: ExtraTransactionResult): string {
let memo = '';
extraResult.body.fields.forEach(field =>
field?.forEach(group =>
group?.Records.forEach(record =>
record?.Fields.forEach((fieldRecord) => {
memo += `${fieldRecord.Label} ${fieldRecord.Value}; `;
}),
),
),
);
return memo;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want to build a memo by ourself.

Maybe you can add an example of an object and we can learn what are you building here.

Copy link
Collaborator

@baruchiro baruchiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution!!

@simonlsk
Copy link
Author

simonlsk commented Feb 2, 2025

Thanks for the review, I'll make the fixes as soon as I can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Mizrahi: Bank Transfers not including extra details
2 participants