Share: Auto requirement deduplication tool
Recently, while learning about LLMs, I tried applying one to requirement deduplication. I want to share it here, since the issue data I got from Redmine was really helpful. If anyone is interested, similar functionality could perhaps be added as a plugin.
I tested with some data downloaded from the Redmine project and got quite good results:
- Accuracy > 99.9%
- Recall ~ 80%
- Precision < 50%
Looking a little more closely at the false negative and false positive examples, I can see that the tool's results are often not really "wrong", just different from the labels that humans had already assigned.
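A note on why accuracy can be above 99.9% while precision is below 50%: almost all requirement pairs are non-duplicates, so the true negatives dominate the accuracy figure. The small sketch below shows this with made-up confusion-matrix counts. The numbers are purely hypothetical and chosen only to land in the same ranges as above; they are not the actual test data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, recall and precision from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all pairs judged correctly
        "recall": tp / (tp + fn),        # share of real duplicates that were found
        "precision": tp / (tp + fp),     # share of flagged pairs that are real duplicates
    }


# Hypothetical counts for illustration only (NOT the real results):
# 1,000,000 pairs in total, of which only 100 are labelled as duplicates.
m = classification_metrics(tp=80, fp=100, fn=20, tn=999_800)

for name, value in m.items():
    print(f"{name}: {value:.4%}")
```

With such an imbalanced data set, recall and precision say much more about the tool than accuracy does.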
The overall logic is quite simple:
1. For each pair of requirements to compare, concatenate each requirement's "Subject" + "Description" into one string.
2. Calculate an embedding for each string, then calculate the cosine similarity between the two embeddings.
3. If the cosine similarity is > 0.5, pass the pair to the LLM and ask it with a simple prompt:
Please compare the following two requirements, with subject and description, and tell me whether they are very similar and should be duplicated. Please reply with the following format:
* Probability: a number between 0% and 100%, showing how much you recommend to set the two tickets to duplicated
** Analysis: Provide your detailed recommendation
** New Requirement: If the probability is > 70%, draft a new requirement to combine the old two requirements
If you are interested, you may check the following (all in Chinese, but maybe you can use Google Translate to assist :)
- online tool: https://new-req-dedup.df.r.appspot.com/
- video: https://www.bilibili.com/video/BV1ix4he6EB1/?share_source=copy_web&vd_source=47a132b313187e2b4ee23819d6bea3c8
Doc sharing:
- Feishu (飞书): https://avb8u30devt.feishu.cn/wiki/Vy9EwKXNZi5D6WkzMZDcyKKUnCe
- Yuque (语雀): https://www.yuque.com/lansetianxie/atpb5x/sdknm94vugb4k488
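For readers who prefer code, the three steps of the overall logic described above can be sketched roughly as follows. This is only a minimal sketch under several assumptions, not the code of the actual tool: `toy_embed` is a word-count stand-in for a real embedding model, the issue dictionaries and their field names are hypothetical, and the LLM call itself is left out (only the prompt text is built).

```python
import math
from collections import Counter
from itertools import combinations

# Reply format taken from the prompt quoted in the post.
PROMPT_HEADER = (
    "Please compare the following two requirements, with subject and "
    "description, and tell me whether they are very similar and should be "
    "duplicated. Please reply with the following format:\n"
    "* Probability: a number between 0% and 100%\n"
    "** Analysis: Provide your detailed recommendation\n"
    "** New Requirement: If the probability is > 70%, draft a new requirement\n"
)


def to_text(issue):
    # Step 1: concatenate "Subject" + "Description" into one string.
    return f"{issue['subject']}\n{issue['description']}"


def toy_embed(text):
    # ASSUMPTION: stand-in for a real embedding model (simple word counts).
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    # Step 2: cosine similarity between two sparse vectors.
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def candidate_pairs(issues, embed=toy_embed, threshold=0.5):
    # Step 3 (first half): keep only pairs whose similarity exceeds 0.5.
    vectors = {issue["id"]: embed(to_text(issue)) for issue in issues}
    result = []
    for a, b in combinations(issues, 2):
        score = cosine_similarity(vectors[a["id"]], vectors[b["id"]])
        if score > threshold:
            result.append((a["id"], b["id"], score))
    return result


def build_prompt(issue_a, issue_b):
    # Step 3 (second half): the text that would be sent to the LLM.
    return (f"{PROMPT_HEADER}\nRequirement 1:\n{to_text(issue_a)}\n\n"
            f"Requirement 2:\n{to_text(issue_b)}\n")


# Hypothetical sample issues, for illustration only.
issues = [
    {"id": 1, "subject": "Export issues to CSV",
     "description": "Add a button to export the issue list to CSV"},
    {"id": 2, "subject": "CSV export of issues",
     "description": "Add a button to export the issue list to a CSV file"},
    {"id": 3, "subject": "Dark theme",
     "description": "Provide a dark colour scheme for the UI"},
]

pairs = candidate_pairs(issues)
by_id = {issue["id"]: issue for issue in issues}
prompts = [build_prompt(by_id[a], by_id[b]) for a, b, _ in pairs]
```

The embedding filter is there to keep the number of LLM calls small: only pairs that pass the 0.5 threshold need a prompt, and the rest are never sent to the LLM.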
Replies (1)
RE: Share: Auto requirement deduplication tool - Added by Lei Hua 2 months ago
Some code sharing: https://github.com/hualei2016/ReqDeDup
The code is not very clean yet, so please treat it as a reference only~