Share: Auto requirement deduplication tool

Added by Lei Hua 2 months ago

I have recently been learning about LLMs and tried applying one to requirement deduplication. I am sharing it here because the issue data I got from Redmine really helped. If anyone is interested, similar functionality could perhaps be added as a plugin.

I tested with some data downloaded from the Redmine project and got quite good results:
  • Accuracy > 99.9%
  • Recall ~ 80%
  • Precision < 50%
Digging a little into the false negative and false positive examples, I can see that the tool's results are often not really "wrong", just different from what humans have already marked.
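To show how accuracy can be above 99.9% while precision stays below 50%, here is a small worked example. The counts are made up for illustration (they are not my actual test data): I assume 1000 issues, which gives 499500 pairs, almost all of which are not duplicates.

```python
def pair_metrics(tp, fp, fn, tn):
    """Return (accuracy, recall, precision) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return accuracy, recall, precision


# Hypothetical counts, NOT the real results:
# 1000 issues -> 1000 * 999 / 2 = 499500 pairs in total.
accuracy, recall, precision = pair_metrics(tp=80, fp=100, fn=20, tn=499_300)

print(f"accuracy  = {accuracy:.4%}")
print(f"recall    = {recall:.1%}")
print(f"precision = {precision:.1%}")
```

Because true negatives dominate the pair count, accuracy is close to 100% almost regardless of how the tool behaves, so recall and precision are the more informative numbers here.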

The overall logic is quite simple:
1. For each pair of requirements to compare, concatenate "Subject" + "Description" into one string per requirement.
2. Calculate an embedding for each string, then calculate the cosine similarity between the two embeddings.
3. If the cosine similarity is > 0.5, pass the pair to the LLM with a simple prompt:

Please compare the following two requirements, with subject and description, and tell me whether they are very similar and should be duplicated.
Please reply with the following format:
* Probability: a number between 0% to 100%, showing how much you recommend to set the two tickets to duplicated
** Analysis: Provide your detailed recommendation
** New Requirement: If the probability is > 70%, draft a new requirement to combine the old two requirements
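The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code from my repository: `toy_embed` is a bag-of-words stand-in so the example runs without any model (a real embedding model would replace it), and the field names `id`, `subject`, `description` as well as the helpers `candidate_pairs`, `build_prompt` and `parse_probability` are hypothetical. The actual LLM call is left out.

```python
import math
import re
from collections import Counter
from itertools import combinations

SIMILARITY_THRESHOLD = 0.5  # threshold from step 3

PROMPT_HEADER = (
    "Please compare the following two requirements, with subject and "
    "description, and tell me whether they are very similar and should "
    "be duplicated."
)


def issue_text(issue):
    # Step 1: concatenate subject and description into one string.
    return f"{issue['subject']}\n{issue['description']}"


def toy_embed(text):
    # Stand-in for a real embedding model: word counts as a sparse vector.
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    # Step 2: cosine similarity between two sparse vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def candidate_pairs(issues, embed=toy_embed, threshold=SIMILARITY_THRESHOLD):
    # Step 3 (first half): keep only pairs above the similarity threshold.
    vectors = {issue["id"]: embed(issue_text(issue)) for issue in issues}
    for a, b in combinations(issues, 2):
        score = cosine_similarity(vectors[a["id"]], vectors[b["id"]])
        if score > threshold:
            yield a, b, score


def build_prompt(a, b):
    # Step 3 (second half): text that would be sent to the LLM.
    return (
        f"{PROMPT_HEADER}\n\n"
        f"Requirement 1:\n{issue_text(a)}\n\n"
        f"Requirement 2:\n{issue_text(b)}"
    )


def parse_probability(reply):
    # Extract the number from a "Probability: NN%" line; None if absent.
    match = re.search(r"Probability:\s*(\d+(?:\.\d+)?)\s*%", reply)
    return float(match.group(1)) if match else None


issues = [
    {"id": 1, "subject": "Export issues to CSV",
     "description": "Allow exporting the issue list to a CSV file"},
    {"id": 2, "subject": "CSV export of issue list",
     "description": "Add a button to export the issue list to CSV"},
    {"id": 3, "subject": "Dark theme",
     "description": "Provide a dark color scheme for the UI"},
]

for a, b, score in candidate_pairs(issues):
    print(a["id"], b["id"], round(score, 2))
    print(build_prompt(a, b))
```

The embedding filter is only a cheap first pass: it keeps the number of LLM calls small, and the LLM reply (parsed with something like `parse_probability`) makes the final recommendation.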

If you are interested, you may check the following (all in Chinese, but Google Translate may help :)

Replies (1)

RE: Share: Auto requirement deduplication tool - Added by Lei Hua 2 months ago

Sharing some code: https://github.com/hualei2016/ReqDeDup
The code is not very clean; it is just for reference.