Today I’m going to tell you more about how to get started working on open source projects to help you merge your first PR on GitHub.

Of course, in addition to the normal PR merge process, I am also going to detail how to handle more complex situations, such as a PR that runs into conflicts, needs additional commits, or needs its commits merged.

 1 
Why get involved in open source projects

In this article I do not intend to write a long essay on “why participate in open source” or list every benefit of doing so. I only want to use one of them, “improving coding ability”, to explain why you should participate in open source projects.

During interviews, if a candidate’s resume says they are familiar with a certain language, I have a habit of asking:

Have you read the source code of an open source project? Or further, have you ever participated in an open source community, or submitted a PR to an open source project?

If the answer is yes, for example the candidate says they have read the source code of some Kubernetes modules, and I confirm that they really read and understood it or really submitted a bugfix/feature PR, then I will not ask questions at the programming language level. Being able to understand part of the source code of a mature open source project, or to submit a bugfix/feature PR to it, says it all.

When I learned Golang myself, it was roughly divided into two stages:

  1. Learning basic syntax and starting to write projects, until I could skillfully complete the development of various business functions;

  2. Reading the source code of some open source projects, which benefited me a lot and took my coding to a new level.

It was roughly when I read the Kubernetes source code that I deeply realized the huge gap between a typical internal enterprise project and an open source project that gathers the wisdom of the world’s best programmers. I also realized how important it is to study the source code of excellent open source projects in order to improve as a programmer (of course, you can say that there is also very good non-open-source code inside Google. There’s no doubt about that, but I don’t think we need to talk about exceptions today).

If you read the source code of an open source project carefully, you will always find some small flaws. That is the moment to submit a PR (Pull Request) and get your code into the open source project, running in “every corner of the world”. How interesting is that! And successfully merging your first PR is often like opening a door to another world: you start to get in touch with the open source community and feel the charm of open source!

 2 
Why I want to introduce how to submit a PR

Our company has open sourced 2 projects, namely:

  • CNCF project DevStream: https://github.com/devstream-io/devstream

  • Apache DevLake: https://github.com/apache/incubator-devlake

The DevStream project and the DevLake project have new contributors submitting PRs every three to five days, but most contributors run into one or more problems when submitting their first PR, such as conflicts, too many or confusing commits, unsigned commits, non-standard commit messages, errors in various CI checks, and so on.

When we see a new contributor submit a PR, we are naturally very happy to welcome them and advise on how to fix the issues, but as the number of contributors grows, our open source community ends up answering the same question almost every day: “How do I properly submit a PR?”

At this point, you may start to wonder whether we failed to provide the corresponding documentation. On the contrary, we have detailed documentation, but people are always lazy. Most new contributors are not willing to look through the documentation carefully before submitting a PR. Many of them have only just come into contact with open source projects, are unfamiliar with how the project and its documents are organized, and do not even think of looking for these documents. In short, for various reasons most new contributors choose to “submit the PR first”.

So today I want to thoroughly explain “how to submit a PR correctly”, covering the whole PR process on GitHub in detail, along with the difficulties you may encounter and their solutions. On the one hand, I hope to help newcomers who are participating in open source projects for the first time. On the other hand, I hope to further lower the barriers to participation in the DevStream and DevLake communities.

  3
I want to participate in an open source project, how do I get started?

Whatever your reason for participating in an open source project, whether it is learning, interest, a sense of achievement, or getting a feature you need into the project, today you have made up your mind to submit a PR to an open source project. Okay, let’s get started!

Looking for a suitable open source project

If you have already decided which open source community to participate in, skip this section.

If you just want to get started with open source and don’t know which community to join yet, I have a few tips:

  • Don’t start with a particularly mature project. Take joining the Kubernetes community today as an example. On the one hand, there are so many contributors that it is hard to grab an entry-level issue for your first PR. On the other hand, with so many contributors your voice gets drowned out, and community maintainers do not care whether there is one more or one fewer of you (of course, nobody may admit it, but you have to believe it). If your PR runs into problems that you cannot solve on your own, it will most likely be closed after timing out, and no one cares whether you had a good experience;

  • Don’t start with a particularly small project. I don’t need to explain this, do I? Very early open source projects may have a lot of problems, such as non-standard code, non-standard collaboration processes, and frequent refactoring that is not issue-driven, leaving external participants at a loss;

  • Choose an incubating project from a well-known open source foundation, such as the Apache Foundation, the Linux Foundation, or CNCF. On the one hand, this type of project is not particularly mature, so it is friendly to new contributors. On the other hand, it is not particularly immature either, so it will not give people a poor participation experience.

For example, you can find open source projects that interest you in these places:

  • CNCF sandbox projects: https://www.cncf.io/sandbox-projects/

  • CNCF incubating projects (the list includes graduated projects): https://www.cncf.io/projects/

  • Apache projects (incubating projects have “Incubating” in their names): https://projects.apache.org/projects.html

Of course, you can also open the door to the open source world directly by choosing the CNCF sandbox project DevStream[1] or the Apache incubating project Apache DevLake[2].

Find contribution points

There are many ways to contribute to an open source project. The most typical is to submit a PR for feature development or a bug fix, but improving documentation, improving test cases, and reporting bugs are also very valuable contributions.

However, this article still focuses on contributions that require a PR. Taking the DevStream project as an example (other projects work the same way), the homepage of the project’s GitHub repository has an Issues entry[3], which records the project’s currently known bugs, proposals (which can be understood as new requirements), documents that are planned to be added, UTs that need to be improved, and so on, as shown below:

DevStream Issues

Among the issues we can generally find some marked with the “good first issue” label. Click this label to filter out all good first issues, which are relatively simple entry-level issues that the community reserves for new contributors:

DevStream Good First Issues

That’s right, start here. Browse through these good first issues, see whether there is one that interests you and has not yet been assigned, then leave a comment below it and wait for the project administrator to assign the task to you before you start coding, like this:

Claim an Issue in DevStream

As shown in the figure, if an issue has not been claimed, you leave a comment, wait for the administrator to assign it to you, and then you can start development.

  4
I want to submit a PR, how do I get started?

The root of an open source project’s codebase will generally have a CONTRIBUTING.md or similarly named document that explains how to start contributing.

In DevStream’s Contributing[4] documentation, we put a Development Workflow[5], which is actually an introduction to the PR workflow, but today I want to talk about the PR workflow in more detail.

Step 1: Fork the project repository

Projects on GitHub have a Fork button. We need to fork the open source project to our own account first. Take DevStream as an example:

Fork DevStream

Click the Fork button, and then go back to your account to find the project you forked:

DevStream Fork

This project is under your own account, which means you have permission to modify it at will. What we need to do later is push our code changes to the forked repository, and then merge the commits into the upstream project through a pull request.

Step 2: Clone the project repository to your local machine

This step is almost the same for any open source project. I wrote the commands out directly so you can copy and paste them. Of course, some variables in the commands need to be adjusted to your own situation. For example, for the DevStream project, we can first configure a few environment variables like this:

# Use $HOME rather than "~", which is not expanded inside double quotes
export WORKING_PATH="$HOME/gocode"
export USER="daniel-hutao"
export PROJECT="devstream"
export ORG="devstream-io"

Similarly, for DevLake the commands become:

export WORKING_PATH="$HOME/gocode"
export USER="daniel-hutao"
export PROJECT="incubator-devlake"
export ORG="apache"

Remember to change USER to your GitHub username. WORKING_PATH can of course be configured flexibly: use whatever path you want to put the code in.

Then there are a few generic commands to clone the repository and configure the remotes:

mkdir -p "${WORKING_PATH}"
cd "${WORKING_PATH}"
# You can also use the url: git@github.com:${USER}/${PROJECT}.git
# if your ssh configuration is proper
git clone https://github.com/${USER}/${PROJECT}.git
cd "${PROJECT}"

git remote add upstream https://github.com/${ORG}/${PROJECT}.git

# Never push to upstream locally
git remote set-url --push upstream no_push

If you have configured SSH for cloning, the URL used for the git clone command can of course be changed to: git@github.com:${USER}/${PROJECT}.git.
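The HTTPS and SSH forms of the fork URL differ only in their prefix. As a minimal sketch (the function name `fork_url` is made up for illustration and is not part of any project tooling), a small helper can build either one:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the clone URL of a fork for a given protocol.
# Usage: fork_url <https|ssh> <github-user> <project>
fork_url() {
  local proto="$1" user="$2" project="$3"
  case "$proto" in
    https) printf 'https://github.com/%s/%s.git\n' "$user" "$project" ;;
    ssh)   printf 'git@github.com:%s/%s.git\n' "$user" "$project" ;;
    *)     echo "unknown protocol: $proto" >&2; return 1 ;;
  esac
}

fork_url ssh daniel-hutao devstream
```

The result can be passed straight to git clone, for example: git clone "$(fork_url ssh "${USER}" "${PROJECT}")".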

After this step, the remote information we see locally should look like this:

git remote -v

origin	git@github.com:daniel-hutao/devstream.git (fetch)
origin	git@github.com:daniel-hutao/devstream.git (push)
upstream	https://github.com/devstream-io/devstream (fetch)
upstream	no_push (push)

Remember, your local code changes are only ever pushed to origin. From origin, you then submit a pull request to upstream.
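If you would like to see the effect of the no_push setting without touching a real repository, the layout can be reproduced offline. This is only a sketch: the two bare repositories are stand-ins for your fork and the upstream project, and all paths are temporary.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the origin/upstream layout in a throwaway directory.
# The two bare repositories stand in for your fork and the upstream project
# on GitHub; nothing here touches the network.
set -e
demo_dir="$(mktemp -d)"
git init -q --bare "$demo_dir/upstream.git"   # stands in for ${ORG}/${PROJECT}
git init -q --bare "$demo_dir/origin.git"     # stands in for ${USER}/${PROJECT}

git init -q "$demo_dir/work"
cd "$demo_dir/work"
git remote add origin "$demo_dir/origin.git"
git remote add upstream "$demo_dir/upstream.git"
# Replace only the push URL, so fetching from upstream keeps working.
git remote set-url --push upstream no_push

git remote -v
```

Because only the push URL of upstream is replaced, git fetch upstream still works, while an accidental git push upstream fails instead of reaching the real project.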

Step 3: Update the local branch code

If you have just finished the fork and clone operations, your local code is certainly up to date. But “just” only happens once. Every time you are ready to start writing code, you need to make sure that the local branch is up to date, because developing on top of old code leads to endless conflicts.

Update the local main branch code:

git fetch upstream
git checkout main
git rebase upstream/main

Of course, I don’t recommend writing code directly on the main branch. Submitting your first PR from main works fine, but what if you need to submit 2 PRs at the same time? In short, you are encouraged to create a more readable branch such as feat-xxx or fix-xxx for development work.

Create a branch:


git checkout -b feat-xxx

This way, we have a feature branch feat-xxx that is the same as the upstream main branch code, and we can start writing code happily!
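Since these commands are repeated every time you start new work, they can be wrapped in a small function. The following is a minimal sketch; the name `start_branch` is hypothetical, and it assumes the remote is called upstream and the default branch is main, as in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helper wrapping the commands from Step 3.
# Assumes a remote named "upstream" and a default branch named "main".
# Usage: start_branch <new-branch-name>
start_branch() {
  local branch="$1"
  git fetch upstream &&
    git checkout main &&
    git rebase upstream/main &&
    git checkout -b "$branch"
}
```

Calling start_branch feat-xxx inside the cloned repository leaves you on a fresh feat-xxx branch based on the latest upstream main. The && chain stops at the first failing step, so a failed fetch never results in a branch created from stale code.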

Step 4: Write code

There is nothing to say about this step, just write, write!

Step 5: Commit and Push

The common process:

git add <file>
git commit -s -m "some description here"
git push origin feat-xxx

Of course, you need to understand the meaning of these commands and parameters and adjust them flexibly. For example, you can use git add --all to complete the add step, and you can add the -f parameter when pushing to force the remote branch to be overwritten (if it already exists but its commit history is not what you want). But remember to add the -s parameter to git commit!

If you’re used to committing from an IDE, that is certainly no problem, like this:

DevStream Commit with Goland

Here you should pay attention to the commit message specification, which may differ between open source projects. For example, the DevStream specification[6] uses a format similar to:

<type>[optional scope]: <description>

[optional body]

[optional footer(s)]

A few examples:

  • feat: some description here

  • docs: some description here

  • fix: some description here

  • fix(core): some description here

  • chore: some description here

  • ……
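A quick local check can catch a malformed subject line before CI does. The sketch below is an illustration only: the function name `check_subject` and the list of allowed types are assumptions, and the project’s own specification remains the authority.

```shell
#!/usr/bin/env bash
# Sketch: check a commit subject against "<type>[optional scope]: <description>".
# The list of allowed types is an assumption for illustration; the project's
# own specification is the authority.
check_subject() {
  local subject="$1"
  local pattern='^(feat|fix|docs|chore|test|refactor)(\([a-z0-9-]+\))?: .+'
  [[ "$subject" =~ $pattern ]]
}

check_subject "fix(core): some description here" && echo "ok"
check_subject "fixed something" || echo "rejected"
```

The function returns success only when the subject starts with a known type, an optional scope in parentheses, a colon, a space, and a non-empty description.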

The commit and push steps can be done in one go in the IDE, or they can be done separately. I am used to doing them separately to give myself more leeway. Also, I’m more used to command-line operations:

git push origin feat-1

Counting objects: 80, done.
Delta compression using up to 10 threads.
Compressing objects: 100% (74/74), done.
Writing objects: 100% (80/80), 13.78 KiB | 4.59 MiB/s, done.
Total 80 (delta 55), reused 0 (delta 0)
remote: Resolving deltas: 100% (55/55), completed with 31 local objects.
remote:
remote: Create a pull request for 'feat-1' on GitHub by visiting:
remote:      https://github.com/daniel-hutao/devstream/pull/new/feat-1
remote:
To github.com:daniel-hutao/devstream.git
 * [new branch]      feat-1 -> feat-1

At this point, the local commits are pushed to the remote.

Step 6: Open a PR

After completing the push operation, we open GitHub and can see a yellow prompt box telling us that we can open a pull request:

Compare & pull request

If you don’t see this box, you can also switch directly to the feat-1 branch and click the “Contribute” button below to open a PR, or directly click Pull requests next to Issues to enter the corresponding page.

By default, the pull request form looks like this:

DevStream Pull Request

Here we need to fill in a suitable title (by default it is the same as the commit message), and then fill in the PR description according to the template. PR templates differ between open source projects, and we need to read the template carefully to avoid making basic mistakes.

For example, the template of DevStream is currently divided into 4 parts:

  • Pre-Checklist: 3 pre-check items are listed here, reminding PR submitters to read the contributing documentation first, to give the code proper comments or documentation, and to add test cases where possible;

  • Description: Fill in the description of the PR here, that is, introduce what your PR contains, for example which problems it solves;

  • Related Issues: Remember that we claim an issue before we start writing code? What we fill in here is the ID of that issue. If the link of the issue you claimed is https://github.com/devstream-io/devstream/issues/796, and the issue can be closed once your PR is merged, you can write “close #796” under Related Issues;

  • New Behavior: After the code is modified, in most cases it needs to be tested. We can paste a screenshot of the test results here, so that reviewers know that your code has passed the tests and behaves as expected, which reduces the review workload and speeds up the merge.
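The text for the Related Issues part can be derived from the issue link. A minimal sketch, assuming the issue ID is the last path segment of the URL as in the example above (the function name `issue_ref` is made up):

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn an issue URL into the text for "Related Issues".
# Assumes the issue ID is the last path segment of the URL.
issue_ref() {
  local url="$1"
  # Strip everything up to and including the last "/" to get the issue ID.
  printf 'close #%s\n' "${url##*/}"
}

issue_ref "https://github.com/devstream-io/devstream/issues/796"
```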

This template is not complicated, we just fill it in.

For example:

DevStream Pull Request Template

Then click “Create pull request” in the lower right corner to finish creating the PR. However, I can’t click this button here, because the changes I used for the demonstration are meaningless and cannot be merged into the upstream codebase. But I still want to show you what a PR looks like after it is created, so let’s take PR 655[7] as an example:

DevStream Pull Request 655

This is a PR I submitted last month, and it basically follows the template format. In addition to the content of the template, you may have noticed an extra Test section. Yes, the template is not set in stone. It only exists to reduce the cost of communication, and you can adjust it appropriately as long as the result becomes clearer. I used the Test section to add a detailed record of my local test results, telling reviewers that I have fully tested locally and that they can merge with confidence.

After submitting the PR, we can find it in the PR list. At this point we also need to check whether all CI checks pass. If any fail, they need to be fixed promptly. Taking DevStream as an example, the CI checks are roughly as follows:

DevStream CI Checks

Step 7: PR Merge

If your PR is perfect and uncontroversial, then before too long the project administrator will merge your PR directly, and the life cycle of your PR will end.

But, yes, there’s a “but” here: the first PR often doesn’t go so smoothly, so let’s take a closer look at some of the problems you may run into and the corresponding solutions.

  5
I submitted a PR and then ran into other issues

In most cases, a PR is not merged immediately after it is submitted. Reviewers may request various changes, the PR itself may break some conventions, or a CI check may simply report an error. How do we solve these? Read on.

Reviewers requested some changes, how do I update the PR?

Many times, after we submit a PR, we need to keep adding commits, for example because we found remaining problems in the code after committing and want to change it again, or because reviewers requested some changes and we need to update the code.

Generally, we follow a convention: before the review starts, updates should avoid introducing new commits, that is, merge whatever can be merged so that the commit history stays clear and meaningful; after the review starts, new commits made in response to reviewers do not need to be merged into the earlier ones, which makes the second round of review more targeted.

However, different communities have different requirements, and some open source projects require that a PR contain only one commit, so judge flexibly according to the actual situation.
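One way to update code before the review starts without adding a commit is to fold the change into the previous commit with git commit --amend. The sketch below runs in a throwaway repository; the file name and commit message are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Sketch: fold a follow-up fix into the previous commit instead of adding a
# new one. Runs in a throwaway repository; file names are made up.
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "line 1" > notes.txt
git add notes.txt
git commit -q -s -m "docs: add notes"

echo "line 2" >> notes.txt          # the forgotten change
git add notes.txt
git commit -q --amend --no-edit -s  # rewrite the last commit, keep its message

git log --oneline                   # still a single commit
```

Because amending rewrites the last commit, a branch that was already pushed has to be updated with git push -f origin feat-xxx afterwards.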

As for how to update the PR, we just continue to modify the code locally and then run the same commands as for the first commit:

git add <file>
git commit -s -m "some description here"
git push origin feat-xxx

Although we are only pushing to the feat-xxx branch of origin, GitHub adds all the new commits to the PR that has not been merged yet. That’s right, you just keep pushing and the PR updates automatically.

As for how to merge commits, we’ll cover that in detail in the next section.

Too many commits or confusing records, how to merge Commits?

In many cases we need to merge (squash) commits. For example, you change 100 lines of code in your first commit, then find that 1 line is missing and submit another commit. The second commit is too trivial to stand on its own, so we need to merge it into the first.

1. Merge Commits with the Git command line

For example, I have 2 commits with the same name here, and the second commit actually only changes one punctuation mark:

Commits To be Merged

We can use the rebase command to merge the 2 commits:

git rebase -i HEAD~2

Executing this command opens an editing page (vim by default), whose content is roughly as follows:

pick 3114c0f docs: just for test
pick 9b7d63b docs: just for test

# Rebase d640931..9b7d63b onto d640931 (2 commands)
#
# Commands:
# p, pick = use commit
# r, reword = use commit, but edit the commit message
# e, edit = use commit, but stop for amending
# s, squash = use commit, but meld into previous commit
# f, fixup = like "squash", but discard this commit's log message
# x, exec = run command (the rest of the line) using shell
# d, drop = remove commit
#
# These lines can be re-ordered; they are executed from top to bottom.
#
# If you remove a line here THAT COMMIT WILL BE LOST.
#
# However, if you remove everything, the rebase will be aborted.

We need to change the second pick to s, then save and exit (vim’s wq command):

pick 3114c0f docs: just for test
s 9b7d63b docs: just for test

This takes us to a second editing page:

# This is a combination of 2 commits.
# This is the 1st commit message:

docs: just for test

Signed-off-by: Daniel Hu 

# This is the commit message #2:

docs: just for test

Signed-off-by: Daniel Hu 

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# ...

This page is for editing the merged commit message. We delete the superfluous parts and keep only these lines:

docs: just for test

Signed-off-by: Daniel Hu 

Then save and exit vim in the same way. At this point you can see the log:

[detached HEAD 80f5e57] docs: just for test
 Date: Wed Jul 6 10:28:37 2022 +0800
 1 file changed, 2 insertions(+)
Successfully rebased and updated refs/heads/feat-1.

At this point, you can check whether the commit history is as expected with the git log command:

Rebased

Okay, now that we have confirmed locally that the commits have been merged, we can push to the remote so that the PR is updated:

git push -f origin feat-xxx

The -f parameter is needed here to force the update. Merging commits rewrites history, so the local branch no longer matches the remote one, and the old commit history on the remote has to be overwritten.
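To practice the squash safely, the whole procedure can be replayed in a throwaway repository. The sketch below is one possible scripted variant, not the only way to do it: GIT_SEQUENCE_EDITOR stands in for editing the todo list in vim, GIT_EDITOR=true accepts the combined commit message unchanged, and the file content is made up.

```shell
#!/usr/bin/env bash
# Sketch: the same squash as above, but scripted so it can be replayed in a
# throwaway repository. GIT_SEQUENCE_EDITOR edits the todo list in place of
# vim; GIT_EDITOR=true accepts the combined commit message unchanged.
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git config user.name "Demo User"
git config user.email "demo@example.com"

git commit -q --allow-empty -m "chore: initial commit"
echo "hello" > README.md
git add README.md
git commit -q -s -m "docs: just for test"
echo "hello." > README.md
git add README.md
git commit -q -s -m "docs: just for test"

# Change the second "pick" to "squash", exactly what we did by hand in vim.
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/squash/'" GIT_EDITOR=true \
  git rebase -i HEAD~2

git log --oneline
```

When pushing the rewritten branch, git push --force-with-lease origin feat-xxx is a more cautious alternative to -f: it refuses to overwrite the remote branch if someone else has pushed to it since your last fetch.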

2. Merge Commits with an IDE

Commits can of course also be merged graphically.

Here is a screenshot:

Squash with Goland

  1. Click Git in the lower right corner;

  2. Select the commits you want to merge;

  3. Right-click, and then click Squash Commits (remember to silently cheer yourself on: Go you!).

Then you can see this page:

Squash with Goland

This is the graphical page for modifying the commit message. Change it to whatever you like, then click the OK button in the lower right corner, and you are done.

Squash with Goland

The 2 commits have now “merged” into one brand-new commit.

PR has a conflict, how to resolve it?

Conflicts can be resolved online or locally, let’s look at them one by one.

1. Online conflict resolution

We should avoid conflicts as much as possible by developing the habit of updating local code every time before writing code. However, conflicts cannot be completely avoided. Sometimes your PR is blocked for a few days, someone else changes the same line of code and gets merged first, and then your PR has a conflict, something like this (again, I can’t really construct conflicts in the upstream project, so the conflicts for the following demonstration are in my own repo):

Conflict Happened

This page makes my heart sink every time I see it. We click the “Resolve conflicts” button to see the specific conflict:

Conflict File

Here we can see the specific conflicting lines, and the next thing to do is resolve the conflict. We need to remove all <<<<<<<, >>>>>>>, and ======= markers and keep only what we want in the end, as follows:

Conflict Resolved

Then click “Mark as resolved” in the upper right corner:

Mark as resolved

Finally click “Commit merge”:

Commit Merge

This completes the conflict resolution, and you can see that a new commit is generated:

Conflict Resolved

At this point, the conflict is resolved.

2. Local conflict resolution

More often, we need to resolve conflicts locally, especially when the conflicts are numerous or complicated.

Similarly, we construct a conflict, this time resolving it locally.

Let’s first take a look at the conflict online:

Conflict Happened

Then we do the following locally:

# First switch back to the main branch
git checkout main
# Pull upstream code (in a real scenario the conflict is with upstream; in our demo environment it is actually origin)
git fetch upstream
# Update the local main (you can also use rebase here, but reset always succeeds whether or not there is a conflict)
git reset --hard upstream/main

At this point, the local main branch is exactly the same as the remote (or upstream) main branch.

Then what we have to do is bring the code of the main branch into our own feature branch and resolve the conflict:

git checkout feat-1
git rebase main
At this time, you will see a log like this:

First, rewinding head to replay your work on top of it...
Applying: docs: conflict test 1
Using index info to reconstruct a base tree...
M       README.md
Falling back to patching base and 3-way merge...
Auto-merging README.md
CONFLICT (content): Merge conflict in README.md
error: Failed to merge in the changes.
Patch failed at 0001 docs: conflict test 1
The copy of the patch that failed is found in: .git/rebase-apply/patch

Resolve all conflicts manually, mark them as resolved with
"git add/rm <conflicted_files>", then run "git rebase --continue".
You can instead skip this commit: run "git rebase --skip".
To abort and get back to the state before "git rebase", run "git rebase --abort".

We need to resolve the conflict: open README.md, find the conflicting place, and modify it directly. The changes here are no different from the online conflict resolution described in the previous section, so I will not repeat them.
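The whole local procedure, including the git add and git rebase --continue steps that the rebase log asks for, can be rehearsed in a throwaway repository. This is only a sketch with made-up file content. The branch names and commit messages mirror the demo in this article.

```shell
#!/usr/bin/env bash
# Sketch: reproduce a rebase conflict locally and resolve it. Everything
# happens in a throwaway repository; the file content is made up.
set -e
repo="$(mktemp -d)"
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "original line" > README.md
git add README.md
git commit -q -m "docs: initial README"

git checkout -q -b feat-1
echo "conflict test 1" > README.md
git commit -q -a -m "docs: conflict test 1"

git checkout -q main
echo "conflict test 2" > README.md
git commit -q -a -m "docs: conflict test 2"

git checkout -q feat-1
# The rebase stops with a conflict, so do not let "set -e" abort the script.
git rebase main >/dev/null 2>&1 || echo "rebase stopped on a conflict"

# Resolve: keep only the final content (no <<<<<<<, =======, >>>>>>> lines).
printf 'conflict test 2\nconflict test 1\n' > README.md
if grep -Eq '^(<<<<<<<|=======|>>>>>>>)' README.md; then
  echo "conflict markers left in README.md" >&2
  exit 1
fi

git add README.md
GIT_EDITOR=true git rebase --continue >/dev/null

git log --oneline
```

The grep check before git add is a cheap guard against accidentally committing leftover conflict markers.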

Again, leave only the final content in the file, and then continue with the git commands:

Conflict Resolved

If you are still not at ease at this point, take a look at the commit history with the git log command:

Commits History

“conflict test 2” here is the commit I made to the main branch. You can see that it was created a little later than “conflict test 1”, but it was merged first. After our rebase, this commit comes first and our feature branch’s “conflict test 1” comes last, which looks harmonious. We then push this change to the remote with a command that has appeared many times:

git push -f origin feat-xxx

At this point, if we go back to GitHub and look at the PR, we can see that the conflict has been resolved and that there is no redundant commit, which means the commit history of this PR is very clean. It’s as if the conflict never arose.

If a commit was made without the -s parameter, the DCO check of the PR reports an error:

Commit With DCO Error

Let’s see how to solve it:

git commit --amend -s

This is a simple command that adds the Signed-off-by line directly to the most recent commit. After executing it, you enter the commit message editing page, which by default looks as follows:

docs: dco test

Signed-off-by: Daniel Hu

At this point we can also modify the commit message. If you don’t need to, just save and exit, and the signature information is added automatically.

What comes after signing? A forced push, of course:

git push -f origin feat-xxx

In this way, the DCO error in your PR is fixed.
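Before pushing, it can help to check locally which commits on your branch are still unsigned. A minimal sketch, assuming the check only looks for a line starting with “Signed-off-by: ” (the function name `unsigned_commits` is made up, and a real DCO check may be stricter):

```shell
#!/usr/bin/env bash
# Hypothetical helper: print commits in <range> that lack a Signed-off-by line.
# Usage: unsigned_commits upstream/main..HEAD
unsigned_commits() {
  local range="$1" sha
  for sha in $(git rev-list "$range"); do
    if ! git log -1 --format=%B "$sha" | grep -q '^Signed-off-by: '; then
      git log -1 --format='%h %s' "$sha"
    fi
  done
}
```

Running unsigned_commits upstream/main..HEAD on a feature branch prints nothing when every commit carries a sign-off. Note that git commit --amend -s only fixes the most recent commit.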

Related links:

  1. https://www.cncf.io/projects/devstream/

  2. https://projects.apache.org/project.html?incubator-devlake

  3. https://github.com/devstream-io/devstream/issues

  4. https://github.com/devstream-io/devstream/blob/main/CONTRIBUTING.md

  5. https://github.com/devstream-io/devstream/blob/main/docs/development/development-workflow.md

  6. https://github.com/devstream-io/devstream/blob/main/docs/development/commit-messages.md

  7. https://github.com/devstream-io/devstream/pull/655

  8. https://wiki.linuxfoundation.org/dco
