Fix: add numbered suffix to duplicate filenames instead of overwriting them #37
Open
chicheese wants to merge 4 commits into JC3:master from
Append a numbered suffix to a file's name if there are multiple files with the same name.
…ed once instead of looping

The duplicate filename handling in buildZIP() tried to rename a file only once when it detected a collision. It generated a candidate name like file_2.ext by applying a regex to the original filepath string, but it never checked whether that candidate was already taken in the zip before writing to it.

As a result, if 15+ files in the HAR shared the same path and name, only 2 files ended up in the zip: the original (file.ext) and one renamed copy (file_2.ext). Every later file overwrote file_2.ext, because the original filepath string never changed between loop iterations, so the regex always produced the same _2 candidate.

The fix replaces the one-shot rename logic with a while loop that starts at counter 2 and keeps incrementing until it finds a candidate path that doesn't already exist in the zip. The filename and extension are split around the last dot (after the last slash, so dots in directory names don't cause issues). Files with and without extensions both work. Each duplicate now gets its own file: file.ext, file_2.ext, file_3.ext, and so on for however many copies are in the HAR.
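The loop described above can be sketched roughly as follows. This is an illustrative Python sketch, not the project's buildZIP() code: the helper names `split_name` and `unique_path` are hypothetical, and a plain set of paths stands in for the zip's list of existing entries.

```python
def split_name(path: str) -> tuple:
    """Split a path into (stem, extension) around the last dot after the last slash."""
    slash = path.rfind("/")
    dot = path.rfind(".")
    # A dot only counts as the extension separator when it sits inside the
    # final path segment, so dots in directory names are ignored.
    if dot > slash:
        return path[:dot], path[dot:]
    return path, ""


def unique_path(path: str, taken: set) -> str:
    """Return `path` unchanged if it is free, otherwise the first free `_N` variant."""
    if path not in taken:
        return path
    stem, ext = split_name(path)
    counter = 2  # the first renamed copy is file_2.ext
    candidate = f"{stem}_{counter}{ext}"
    # Keep incrementing until the candidate does not collide with an existing entry.
    while candidate in taken:
        counter += 1
        candidate = f"{stem}_{counter}{ext}"
    return candidate


# Simulate 15 HAR entries that all share the same path and name.
taken = set()   # hypothetical stand-in for the entries already in the zip
written = []
for _ in range(15):
    name = unique_path("assets/v1.2/file.ext", taken)
    taken.add(name)
    written.append(name)
```

In this sketch `written` ends up with 15 distinct paths, from `assets/v1.2/file.ext` through `assets/v1.2/file_15.ext`, and the dot in the `v1.2` directory name is left untouched. Checking each candidate against the set of taken names is the step the one-shot rename was missing.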
Merge patch-1 into master
If a HAR file has multiple entries that share the same path and filename, the zip silently overwrote them, and you would end up with only one file instead of all of them.

This adds handling so that when a filename collision is detected, the new file gets a numbered suffix instead of overwriting the existing one. So if you have 15 files all named file.ext, you would get file.ext, file_2.ext, file_3.ext, and so on for however many there are.

The way it works: it checks whether the filename already exists in the zip, and if it does, it tries file_2.ext, then file_3.ext, and keeps incrementing until it finds a name that isn't taken yet, then writes to that. The filename and extension are split around the last dot after the last slash, so dots in directory names do not cause any issues. Files with and without extensions both work fine.

Credit to xjcb-de, who originally started looking into this problem on their fork in October of 2024.