Added logic to fetch README files, documentation, commit messages, an… #2919
base: main
🔍 Existing Issues For Review
Your pull request is modifying functions with the following pre-existing issues:
📄 File: website/management/commands/update_projects.py
@DonnieBLT I believe that for the CodeQL test the languages should be given as 'python, javascript' rather than 'python javascript', though I am unsure how to fix this.
The token is there because it helps with rate limiting. You can create a token from your GitHub profile settings page. Also, can you please avoid whitespace changes? I'm not sure what linter settings you're using, but if we can standardize them we won't see the whitespace changes.
95dc670 to 34add4f
I've added the GitHub token and fixed the formatting. Please review and let me know.
Can you please check the comments?
)

# Set Issue Tracker URL
project.issue_tracker_url = f"https://github.com/{repo_name}/issues"
We can remove this since it's universal
Understood.
…lon21/BLT into ai-project-enhancement
I have integrated a basic summarization model, "facebook/bart-large-cnn". However, because README files vary so much between repositories, the summaries aren't very effective. I had thought of pre-processing the content to pass only the relevant sections, but even that seems difficult since the READMEs follow no particular structure. @DonnieBLT, which direction should I look into to improve this? Although the OpenAI API is paid, it could do the job much better than the generic Python models.
We can use OpenAI; we already have an API key, and it is set up in the code.
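Whichever summarizer is used, the pre-processing idea raised above can be approximated without relying on any README structure: strip the parts that carry little meaning (code fences, badges, images, raw HTML) and cap the length before sending the text on. The sketch below is only an illustration; `condense_readme` and the `max_chars` limit are hypothetical names and values, not part of this PR.

```python
import re


def condense_readme(markdown: str, max_chars: int = 4000) -> str:
    """Reduce a README to mostly-prose text (hypothetical helper, not from the PR)."""
    text = re.sub(r"```.*?```", " ", markdown, flags=re.DOTALL)  # fenced code blocks
    text = re.sub(r"!\[[^\]]*\]\([^)]*\)", " ", text)  # images and badges
    text = re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", text)  # links: keep the label only
    text = re.sub(r"<[^>]+>", " ", text)  # raw HTML tags
    text = re.sub(r"^#+\s*", "", text, flags=re.MULTILINE)  # heading markers
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    return text[:max_chars]  # assumed cap to keep the prompt small
```

The cap is a plain character cut, so it may end mid-sentence; a token-aware limit would be a reasonable refinement.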
@@ -53,130 +62,7 @@ <h3>Projects: {{ projects.count }}</h3>
{% endfor %}
</ul>
{% endif %}
<ul class="project-list">
Can we keep this in this file, please?
So I assume we don't want a separate template and should keep all the code in this file? Let us finalize what to do with the search function and I'll make the changes accordingly.
Yes we can combine this into one template and adjust the global search to work as you have it.
website/label_generation.py
Outdated
openai.api_key = os.getenv("OPENAI_API_KEY")


def generate_labels(readme_content, github_topics):
Can you move this to utils.py too, please?
project.readme_content = readme_content
readme_text = markdown_to_text(readme_content)
project.ai_summary = ai_summary(readme_text, project.topics)
project.ai_labels = json.loads(generate_labels(readme_text, project.topics))
Should we just add the labels verbatim from the topics?
I found the AI-generated labels to be more accurate and effective, but we can surely use these topics directly. I'll modify it
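The two options discussed here can also be combined: calling `json.loads` directly on model output fails if the model returns anything other than valid JSON, so the GitHub topics can serve as the fallback. A minimal sketch, assuming the model is asked to return a JSON list of strings; `parse_labels` is a hypothetical helper, not code from this PR.

```python
import json


def parse_labels(raw, topics):
    """Parse model output as a JSON list of labels, falling back to the topics."""
    try:
        labels = json.loads(raw)
    except (TypeError, ValueError):
        # json.JSONDecodeError subclasses ValueError; TypeError covers raw=None
        return list(topics)
    if not isinstance(labels, list):
        # Valid JSON, but not the list shape we asked for
        return list(topics)
    cleaned = [str(label).strip() for label in labels if str(label).strip()]
    return cleaned or list(topics)
```

With a helper like this, the assignment could read `project.ai_labels = parse_labels(generate_labels(readme_text, project.topics), project.topics)`, so a malformed response degrades to the topics instead of raising.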
Requesting a few changes; we're almost there.
…e url from project model
Have made most of the changes; I just need a final heads-up on what to do with the search functionality, as it's quite buggy and limited. Should I create a new PR to improve it if we go that route?
A few adjustments requested.
)

# Check for Documentation URL (homepage)
project.documentation_url = repo_data.get("homepage")
I believe we have a homepage field
website/models.py
Outdated
@@ -756,6 +756,10 @@ class Project(models.Model):
closed_issues = models.IntegerField(default=0)
size = models.IntegerField(default=0)
commit_count = models.IntegerField(default=0)
readme_content = models.TextField(null=True, blank=True)
documentation_url = models.URLField(null=True, blank=True)
Use homepage_url
Done
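When mapping the repository payload onto the existing homepage field, note that the `homepage` value in the GitHub repository response can be null or an empty string, so it may be worth normalizing before saving. A small sketch under that assumption; `homepage_from_repo` is a hypothetical helper name, and `homepage_url` is the field name taken from the review comment above.

```python
from typing import Optional


def homepage_from_repo(repo_data: dict) -> Optional[str]:
    """Return the repository homepage, or None when it is missing or blank."""
    homepage = (repo_data.get("homepage") or "").strip()
    return homepage or None
```

Usage would then be along the lines of `project.homepage_url = homepage_from_repo(repo_data)`.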
@@ -22,6 +22,15 @@ <h3>Projects: {{ projects.count }}</h3>
<i class="fas fa-plus-circle"></i> Add Project
</button>
</form>
<form id="search-form" class="search-form">
Can you combine them into the one top search?
I have combined the search bars into one and added all the remaining categories. I'll raise a new PR later to improve the UI of the search results; I have kept it basic for now.
…d issue trackers from repository APIs.
Related issue: #2681
Fixed the migration file issue that I was facing in the previous PR.
I had to remove the line "Authorization": f"token {settings.GITHUB_TOKEN}" from the headers because it caused the error 'Unable to fetch repository - 401'. I was unsure how to deal with it, so I have removed it for now.
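A 401 from the GitHub API generally points to credentials that are empty or invalid, so one plausible cause here (an assumption, the thread does not confirm it) is that the token setting was unset when the header was sent. Rather than dropping the header entirely, the request can include it only when a token is actually configured. The sketch below assumes a hypothetical helper named `github_headers`.

```python
from typing import Optional


def github_headers(token: Optional[str]) -> dict:
    """Build GitHub API headers, adding Authorization only for a non-empty token."""
    headers = {"Accept": "application/vnd.github+json"}
    if token and token.strip():
        # Authenticated requests get a higher rate limit than anonymous ones
        headers["Authorization"] = f"token {token.strip()}"
    return headers
```

A call site could then pass `headers=github_headers(getattr(settings, "GITHUB_TOKEN", None))`, which keeps unauthenticated requests working locally while still using the token where it is set.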