We highly welcome and value your contributions to PaddleNLP
. The first step to start your contribution is to sign the PaddlePaddle Contributor License Agreement.
This document explains our workflow and work style:
PaddlePaddle uses the Git branching model. The following steps guide usual contributions.
Our development community has been growing fastly; it doesn't make sense for everyone to write into the official repo. So, please file Pull Requests from your fork. To make a fork, just head over to the GitHub page and click the "Fork" button.
To make a copy of your fork to your local computers, please run
git clone https://github.com/<your-github-account>/PaddleNLP
cd PaddleNLP
For daily works like adding a new feature or fixing a bug, please open your feature branch before coding:
git checkout -b my-cool-feature
Before you start coding, you need to setup the development environment. We highly recommend doing all your development in a virtual environment such as venv or conda. After you setup and activated your virtual environment, run the following command:
make install
This will setup all the dependencies of PaddleNLP
as well as the pre-commit
tool.
If you are working on the examples
or applications
module and require importing from PaddleNLP
, make sure you install PaddleNLP
in editable mode.
If PaddleNLP
is already installed in the virtual environment, remove it with pip uninstall paddlenlp
before reinstalling it in editable mode with
pip install -e .
As you develop your new exciting feature, keep in mind that it should be covered by unit tests. All of our unit tests can be found under the tests
directory.
You can either modify existing unit test to cover the new feature, or create a new test from scratch.
As you finish up the your code, you should make sure the test suite passes. You can run the tests impacted by your changes like this:
pytest tests/<test_to_run>.py
We utilizes pre-commit
(with black, isort and
flake8 under the hood) to check the style of code and documentation in every commit. When you run run git commit
, you will see
something like the following:
➜ (my-virtual-env) git commit -m "commiting my cool feature"
black....................................................................Passed
isort....................................................................Passed
flake8...................................................................Passed
check for merge conflicts................................................Passed
check for broken symlinks............................(no files to check)Skipped
detect private key.......................................................Passed
fix end of files.....................................(no files to check)Skipped
trim trailing whitespace.............................(no files to check)Skipped
CRLF end-lines checker...............................(no files to check)Skipped
CRLF end-lines remover...............................(no files to check)Skipped
No-tabs checker......................................(no files to check)Skipped
Tabs remover.........................................(no files to check)Skipped
copyright_checker........................................................Passed
But most of the time things don't go so smoothly. When your code or documentation doesn't meet the standard, the pre-commit
check will fail.
➜ (my-virtual-env) git commit -m "commiting my cool feature"
black....................................................................Passed
isort....................................................................Failed
- hook id: isort
- files were modified by this hook
Fixing examples/information_extraction/waybill_ie/run_ernie_crf.py
flake8...................................................................Passed
check for merge conflicts................................................Passed
check for broken symlinks............................(no files to check)Skipped
detect private key.......................................................Passed
fix end of files.....................................(no files to check)Skipped
trim trailing whitespace.............................(no files to check)Skipped
CRLF end-lines checker...............................(no files to check)Skipped
CRLF end-lines remover...............................(no files to check)Skipped
No-tabs checker......................................(no files to check)Skipped
Tabs remover.........................................(no files to check)Skipped
copyright_checker........................................................Passed
But don't panic!
Our tooling will fix most of the style errors automatically. Some errors will need to be addressed manually. Fortunately, the error messages are straight forward and
the errors are usually simple to fix. After addressing the errors, you can run git add <files>
and git commit
again, which will trigger pre-commit
again.
Once the pre-commit
checks pass, you are ready to push the code.
[Google][http://google.com/] or StackOverflow are great tools to help you understand the code style errors.
Don't worry if you still can't figure it out. You can commit with git commit -m "style error" --no-verify
and we are happy to help you once you create a Pull Request.
-
Keep pulling
An experienced Git user pulls from the official repo often -- daily or even hourly, so they notice conflicts with others work early, and it's easier to resolve smaller conflicts.
git remote add upstream https://github.com/PaddlePaddle/PaddleNLP git pull upstream develop
-
Push and file a pull request
You can "push" your local work into your forked repo:
git push origin my-cool-stuff
The push allows you to create a pull request, requesting owners of this official repo to pull your change into the official one.
To create a pull request, please follow these steps.
-
Delete local and remote branches
To keep your local workspace and your fork clean, you might want to remove merged branches:
git push origin my-cool-stuff git checkout develop git pull upstream develop git branch -d my-cool-stuff
-
Please feel free to ping your reviewers by sending them the URL of your pull request via IM or email. Please do this after your pull request passes the CI.
-
Please answer reviewers' every comment. If you are to follow the comment, please write "Done"; Otherwise, please start a discussion under the comment.
-
If you don't want your reviewers to get overwhelmed by email notifications, you might reply their comments by in a batch.