-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fetchRepo runs GitHub API v3 and v4 concurrently #22
base: master
Are you sure you want to change the base?
Conversation
Awesome 😍 |
@@ -59,14 +55,12 @@ optional arguments: | |||
const user = new DbFile(path.join(data.users, file)); | |||
if (!user.ghuser_deleted_because && !user.removed_from_github) { | |||
users.push(user); | |||
spinner.text = `${spinnerText} [${users.length}]`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was actually still quite useful to me, when I SSH to the machine running the fetchBot. This kind of loops can take several minutes and it's nice to see that the script isn't hanging.
But as you know, I intend to refactor all these big loops everywhere. So I suggest you don't worry about it and I'll push some refactoring in this branch on top of your work, and everything will get merged together at the end. Is it OK for you?
|
||
for (const repoFullName in repoPaths) { | ||
const repo = new DbFile(repoPaths[repoFullName].repo); | ||
full_names = [...referencedRepos]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for myself: I need to validate this kind of lines on the full DB, because we have more than 100k repos and this might be expensive.
full_names = [...referencedRepos]; | ||
await Promise.all([ | ||
loopDo(full_names, async (full_name) => { await fetchRepo(ghclV3, full_name, firsttime)}), | ||
loopDo(full_names, async (full_name) => { await fetchRepo(ghclV4, full_name, firsttime)}), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll disable the v4-worker for now and as long as the front end isn't ready to deal with an empty repo.organization object. But it still has high-prio for me.
const repo = async (oraSpinner, errCodes, repoFullName, repoFetchedAtDate) => { | ||
`; | ||
// TODO: This sets "organization" field if appropriate, but as an empty object. | ||
const repo = async (errCodes, repoFullName, repoFetchedAtDate) => { | ||
// TODO: Use repoFetchedAtDate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep that's necessary (and actually the argument name isn't the best). If we don't do conditional requests, we'll consume the API credits 10x faster (maybe even 100x, I wouldn't be surprised).
The argument should be called ifModifiedSince
at this level.
Nice @wowawiwa, thanks, we're getting close to using GraphQL in production :) |
Conflicts: fetchRepos.js impl/github.js
What this PR does:
fetchRepos
to run using both v3 and v4 of GitHub API concurrently.console.log
instead (spinner doesn't work well with concurrent code)Differences v4 driver compared to v3:
webflow
userrepo
query, theorganization
field is just an empty object instead of containing information. When the owner is not an organization, there is noorganization
field at all.