Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,6 @@
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
import java.util.LinkedList;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Objects;
Expand Down Expand Up @@ -217,7 +216,6 @@ public enum NameSource {
}

static final ZipArchiveEntry[] EMPTY_ARRAY = {};
static LinkedList<ZipArchiveEntry> EMPTY_LINKED_LIST = new LinkedList<>();

public static final int PLATFORM_UNIX = 3;
public static final int PLATFORM_FAT = 0;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
Expand Down Expand Up @@ -368,9 +369,14 @@ public static void closeQuietly(final ZipFile zipFile) {
private final List<ZipArchiveEntry> entries = new LinkedList<>();

/**
* Maps String to list of ZipArchiveEntrys, name -> actual entries.
* Maps a string to the first entry by that name.
*/
private final Map<String, LinkedList<ZipArchiveEntry>> nameMap = new HashMap<>(HASH_SIZE);
private final Map<String, ZipArchiveEntry> nameMap = new HashMap<>(HASH_SIZE);

/**
* If multiple entries have the same name, maps the name to entries by that name.
*/
private Map<String, List<ZipArchiveEntry>> duplicateNameMap;
Comment on lines -373 to +379
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is extremely unlikely, would it make sense to keep the original Map and change the value to Object and let it hold either ZipArchiveEntry or the List? It would eventually complicate the lookup but that should be a single place only.

Copy link
Member

@garydgregory garydgregory Dec 31, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I REALLY don't like that. When you look at the pack200 code, which is written at the pre-generics Java version level, and some of its lists, you can't tell if some of the lists are buggy, sloppy, or trying to be clever. That or the lack of generics let's you get away with not having fully thought through the hierarchy of types you put in those lists.

Copy link
Contributor

@kvr000 kvr000 Jan 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true, such optimizations typically come at cost of readability and safety. However, keeping the amount of code touching the data structures limited may help - we'll need only three methods - first-by-name, all-by-name and add-entry . And document the field to not touch except via those three methods :-)

Just a suggestion if it comes to true optimization, given this PR is purely about improving time and space performance. Note that the getEntries(name) is not slower only because of wrapping into a Collection but also because of performing double lookup.


/**
* The encoding to use for file names and the file comment.
Expand Down Expand Up @@ -749,8 +755,22 @@ private void fillNameMap() {
// entries are filled in populateFromCentralDirectory and
// never modified
final String name = ze.getName();
final LinkedList<ZipArchiveEntry> entriesOfThatName = nameMap.computeIfAbsent(name, k -> new LinkedList<>());
entriesOfThatName.addLast(ze);
final ZipArchiveEntry firstEntry = nameMap.putIfAbsent(name, ze);

if (firstEntry != null) {
if (duplicateNameMap == null) {
duplicateNameMap = new HashMap<>();
}

final List<ZipArchiveEntry> entriesOfThatName = duplicateNameMap.computeIfAbsent(name, k -> {
// Create a list when there are two entries with the same name
final ArrayList<ZipArchiveEntry> list = new ArrayList<>(2);
list.add(firstEntry);
return list;
});

entriesOfThatName.add(ze);
}
});
}

Expand Down Expand Up @@ -820,7 +840,13 @@ public Enumeration<ZipArchiveEntry> getEntries() {
* @since 1.6
*/
public Iterable<ZipArchiveEntry> getEntries(final String name) {
return nameMap.getOrDefault(name, ZipArchiveEntry.EMPTY_LINKED_LIST);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned a potential 30% slowdown in this method in return for additional safety, which may or may not be a fair tradeoff. But I'm a bit concerned that this change potentially breaks the current behaviour. As you say, this currently returns a modifiable LinkedList, which some people may (though I hope not) depend on? Because it requires an explicit cast, I'm not too worried.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the slowdown is only in case there are duplicit entries which is extremely unlikely. I'd still simplify the code, as according to earlier comment.

final List<ZipArchiveEntry> entriesOfThatName = duplicateNameMap == null ? null : duplicateNameMap.get(name);
if (entriesOfThatName == null) {
final ZipArchiveEntry entry = nameMap.get(name);
return entry != null ? Collections.singletonList(entry) : Collections.emptyList();
} else {
return Collections.unmodifiableList(entriesOfThatName);
}
}

/**
Expand All @@ -847,8 +873,16 @@ public Enumeration<ZipArchiveEntry> getEntriesInPhysicalOrder() {
* @since 1.6
*/
public Iterable<ZipArchiveEntry> getEntriesInPhysicalOrder(final String name) {
final LinkedList<ZipArchiveEntry> linkedList = nameMap.getOrDefault(name, ZipArchiveEntry.EMPTY_LINKED_LIST);
return Arrays.asList(sortByOffset(linkedList.toArray(ZipArchiveEntry.EMPTY_ARRAY)));
if (duplicateNameMap != null) {
final List<ZipArchiveEntry> list = duplicateNameMap.get(name);
if (list != null) {
final ZipArchiveEntry[] entriesOfThatName = list.toArray(ZipArchiveEntry.EMPTY_ARRAY);
return Arrays.asList(sortByOffset(entriesOfThatName));
}
}

final ZipArchiveEntry entry = nameMap.get(name);
return entry != null ? Collections.singletonList(entry) : Collections.emptyList();
}

/**
Expand All @@ -862,8 +896,7 @@ public Iterable<ZipArchiveEntry> getEntriesInPhysicalOrder(final String name) {
* @return the ZipArchiveEntry corresponding to the given name - or {@code null} if not present.
*/
public ZipArchiveEntry getEntry(final String name) {
final LinkedList<ZipArchiveEntry> entries = nameMap.get(name);
return entries != null ? entries.getFirst() : null;
return nameMap.get(name);
}

/**
Expand Down