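Today I was handed an ad hoc task: count the code commits in our team's GitLab projects, broken down by project, by branch, and by author.
Counting by hand would be slow and tedious, so I chose to do the job with scripts.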
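I. Python script implementation:
1. Install the dependencies first (re is part of the Python standard library and does not need to be installed):
pip3 install python-gitlab pandas
2. The code is as follows: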
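import re

import gitlab
import pandas as pd

# Replace the URL and private_token with your own GitLab address and access token
gl = gitlab.Gitlab('http://gitlab.cosmoplat.com/', private_token='LuLNxxxxxxxxxxgaVc', timeout=60, api_version='4')
# Time window for the statistics (ISO 8601)
start_time = '2020-10-01T00:00:00Z'
end_time = '2023-07-01T23:00:00Z'
# If there are many projects, filter them with a list of name keywords
targetList = ['haier-iot', 'iot-starters']


def find_word_in(msg):
    """Return True if msg matches any keyword in targetList (case-insensitive)."""
    for word in targetList:
        if re.search(word, msg, re.IGNORECASE):
            return True
    return False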
def get_gitlab():
    """
    Collect per-commit statistics through the GitLab API.
    """
    list2 = []
    projects = gl.projects.list(owned=True, all=True)
    num = 0
    for project in projects:
        # Skip projects whose name matches none of the keywords
        if not find_word_in(project.name):
            continue
        num += 1
        for branch in project.branches.list(iterator=True):
            commits = project.commits.list(all=True, query_parameters={'since': start_time, 'until': end_time, 'ref_name': branch.name})
            for commit in commits:
                pro = {}
                try:
                    # Fetch the single commit to read its stats
                    com = project.commits.get(commit.id)
                    pro["projectName"] = project.path_with_namespace
                    pro["authorName"] = com.author_name
                    pro["branch"] = branch.name
                    pro["additions"] = com.stats["additions"]
                    pro["deletions"] = com.stats["deletions"]
                    pro["commitNum"] = com.stats["total"]
                    list2.append(pro)
                    print(pro)
                except Exception as e:
                    # Catch the exception, report it, and keep the program running
                    print("An error occurred, please check:", e)
                    continue
    return list2
def data():
    """
    Merge the per-commit records into one row per project, author and branch.
    """
    ret = {}
    for ele in get_gitlab():
        # A tuple key avoids accidental collisions from plain string concatenation
        key = (ele["projectName"], ele["authorName"], ele["branch"])
        if key not in ret:
            ret[key] = ele
            ret[key]["commitTotal"] = 1
        else:
            ret[key]["additions"] += ele["additions"]
            ret[key]["deletions"] += ele["deletions"]
            ret[key]["commitNum"] += ele["commitNum"]
            ret[key]["commitTotal"] += 1
    list1 = []
    for key, v in ret.items():
        # Rename the keys to the CSV column headers
        v["Project"] = v.pop("projectName")
        v["Developer"] = v.pop("authorName")
        v["Branch"] = v.pop("branch")
        v["Lines added"] = v.pop("additions")
        v["Lines deleted"] = v.pop("deletions")
        v["Total lines changed"] = v.pop("commitNum")
        v["Commits"] = v["commitTotal"]
        list1.append(v)
    print(list1)
    return list1


def csv(csvName):
    """
    Write the aggregated rows to a CSV file.
    """
    df = pd.DataFrame(data(), columns=["Project", "Developer", "Branch", "Lines added", "Lines deleted", "Total lines changed", "Commits"])
    df.to_csv(csvName, index=False, encoding="utf_8_sig")


if __name__ == "__main__":
    csv("./gitlab.csv")
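The merging step in data() could also be written with a pandas groupby. The following is only a minimal sketch, not part of the original script: the function name summarize and the sample records (authors "alice" and "bob") are made up for illustration, and the records merely mimic the dictionaries that get_gitlab() returns.

```python
import pandas as pd

# Hypothetical sample records, shaped like the dicts built in get_gitlab()
records = [
    {"projectName": "iotplat/haier-iot", "authorName": "alice", "branch": "master",
     "additions": 10, "deletions": 2, "commitNum": 12},
    {"projectName": "iotplat/haier-iot", "authorName": "alice", "branch": "master",
     "additions": 5, "deletions": 1, "commitNum": 6},
    {"projectName": "iotplat/haier-iot", "authorName": "bob", "branch": "dev",
     "additions": 3, "deletions": 0, "commitNum": 3},
]


def summarize(records):
    """Sum line counts and count commits per (project, author, branch)."""
    df = pd.DataFrame(records)
    keys = ["projectName", "authorName", "branch"]
    return df.groupby(keys, as_index=False).agg(
        additions=("additions", "sum"),
        deletions=("deletions", "sum"),
        commitNum=("commitNum", "sum"),
        commitTotal=("additions", "count"),  # one input row per commit
    )


summary = summarize(records)
print(summary)
```

This sketch assumes a non-empty list of records. With an empty list the DataFrame has no columns and the groupby would fail, so a real script would need a guard for that case.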
II. Shell script implementation
1. I first wrote a simple version that handles only a single project:
#!/bin/bash
repo_url="http://gitlab.cosmoplat.com/iotplat/haier-tb.git"
repo_name="haier-tb"
repo_path="/jiguiquan/gitStat/haier-tb"
output_file="/jiguiquan/gitStat/commit_counts.csv"
# Write the CSV header
echo "Project,Branch,Author,Lines added,Lines deleted,Lines changed" > "$output_file"
# Clone the repository locally
git clone "$repo_url" "$repo_path"
cd "$repo_path" || exit 1
# Get all remote branches
branches=$(git branch -a | grep remotes/origin | grep -v HEAD | sed 's/^\s*//g' | sed 's/remotes\/origin\///g')
# Count the committed lines
for branch in $branches
do
    # Switch to the branch
    git checkout "$branch"
    # The key step: for each author, append one row to $output_file with echo and awk
    git log --format='%aN' | sort -u | while read -r name; do
        echo -en "$repo_name,\t $branch,\t" >> "$output_file"
        echo -en "$name,\t" >> "$output_file"
        git log --author="$name" --pretty=tformat: --numstat | awk '{add += $1; subs += $2; loc += $1 + $2 } END { printf "%s, %s, %s\n", add, subs, loc }' >> "$output_file"
    done
done
# Clean up the temporary clone
cd ..
rm -rf "$repo_path"
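To make the awk step easier to follow, here is a rough Python sketch of the same summation. It assumes the usual `git log --numstat` line layout of added, deleted and path separated by tabs, with "-" in place of the counts for binary files. The function names sum_numstat and numstat_for_author are illustrative and do not appear in the script above.

```python
import subprocess


def sum_numstat(numstat_text):
    """Sum the added/deleted columns of `git log --numstat` output."""
    added = deleted = 0
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # blank separator lines between commits
        # Binary files show "-" instead of a number; count them as 0, as awk does
        added += int(parts[0]) if parts[0].isdigit() else 0
        deleted += int(parts[1]) if parts[1].isdigit() else 0
    return added, deleted, added + deleted


def numstat_for_author(name, repo_path="."):
    """Run the same git command as the shell script (requires a local clone)."""
    result = subprocess.run(
        ["git", "log", f"--author={name}", "--pretty=tformat:", "--numstat"],
        cwd=repo_path, capture_output=True, text=True, check=True,
    )
    return sum_numstat(result.stdout)


# Hypothetical numstat output covering a text file, a binary file and a blank line
sample = "10\t2\tsrc/Main.java\n-\t-\tlogo.png\n\n3\t0\tREADME.md\n"
print(sum_numstat(sample))  # (13, 2, 15)
```

The three returned values correspond to the add, subs and loc variables in the awk program.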
2. Later I upgraded the script and wrapped it in a function so that it can process several projects in one run:
#!/bin/bash
output_file="/jiguiquan/gitStat/commit_counts.csv"
# Function: count the committed lines for one repository
function calculate_commit_counts() {
    local repo_url=$1
    local repo_name=$2
    local repo_path="./$repo_name"
    local branches
    echo "$repo_path"
    # Clone the repository locally
    git clone "$repo_url" "$repo_path"
    cd "$repo_path" || return 1
    # Get all remote branches
    branches=$(git branch -a | grep remotes/origin | grep -v HEAD | sed 's/^\s*//g' | sed 's/remotes\/origin\///g')
    echo $branches
    # Count the committed lines
    for branch in $branches
    do
        # Switch to the branch
        git checkout "$branch"
        # The key step: for each author, append one row to $output_file with echo and awk
        git log --format='%aN' | sort -u | while read -r name; do
            echo -en "$repo_name,\t" >> "$output_file"
            echo -en "$branch,\t" >> "$output_file"
            echo -en "$name,\t" >> "$output_file"
            git log --author="$name" --pretty=tformat: --numstat | awk '{add += $1; subs += $2; loc += $1 + $2 } END { printf "%s, %s, %s\n", add, subs, loc }' >> "$output_file"
        done
    done
    # Clean up the temporary clone
    cd ..
    rm -rf "$repo_path"
}
# Write the CSV header
echo "Project,Branch,Author,Lines added,Lines deleted,Lines changed" > "$output_file"
# Call the function for each Git project
calculate_commit_counts "http://gitlab.cosmoplat.com/iotplat/haier-rule-engine.git" "haier-rule-engine"
calculate_commit_counts "http://gitlab.cosmoplat.com/data-space/data-space.git" "data-space"
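The branch-listing pipeline (git branch -a, two greps, two seds) can be a little hard to read. As an illustration only, the same filtering might look like this in Python. The function name remote_branches and the sample text are assumptions made for this sketch and are not taken from the script.

```python
def remote_branches(branch_output):
    """Extract branch names from `git branch -a` output, mirroring the grep/sed pipeline."""
    prefix = "remotes/origin/"
    names = []
    for line in branch_output.splitlines():
        # Drop leading whitespace and the "* " marker of the current branch
        entry = line.strip().lstrip("* ").strip()
        # Keep only remote branches and skip the symbolic HEAD entry
        if not entry.startswith(prefix) or "HEAD" in entry:
            continue
        names.append(entry[len(prefix):])
    return names


# Hypothetical `git branch -a` output from a fresh clone
sample = (
    "* master\n"
    "  remotes/origin/HEAD -> origin/master\n"
    "  remotes/origin/dev\n"
    "  remotes/origin/feature/login\n"
    "  remotes/origin/master\n"
)
print(remote_branches(sample))  # ['dev', 'feature/login', 'master']
```

Like grep -v HEAD in the shell version, this sketch would also drop any branch whose name happens to contain "HEAD".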
III. The Python script and the shell script produce the same results
The output looks like this:




