From 61d819174ed0495079b66b2ebf382ed71aa62101 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 29 Jan 2026 18:22:04 +0000
Subject: [PATCH 1/2] Initial plan

From 7e7d39ec7622c0c44edcecee2a93f5d337c49666 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Thu, 29 Jan 2026 18:24:35 +0000
Subject: [PATCH 2/2] Add GitHub Copilot instructions for RecursiveExtractor repository

Co-authored-by: gfs <98900+gfs@users.noreply.github.com>
---
 .github/copilot-instructions.md | 151 ++++++++++++++++++++++++++++++++
 1 file changed, 151 insertions(+)
 create mode 100644 .github/copilot-instructions.md

diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
new file mode 100644
index 0000000..90fc64c
--- /dev/null
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,151 @@
+# GitHub Copilot Instructions for RecursiveExtractor
+
+## Project Overview
+
+RecursiveExtractor is a cross-platform .NET library and CLI tool for parsing archive files and disk images, including nested archives. It provides a unified interface to extract arbitrary archives using libraries like SharpCompress and DiscUtils.
+
+## Tech Stack
+
+- **Language**: C# 10.0
+- **Target Frameworks**: .NET Standard 2.0, .NET Standard 2.1, .NET 8.0, .NET 9.0, .NET 10.0
+- **Testing Framework**: xUnit (based on project structure)
+- **Key Dependencies**: SharpCompress, LTRData.DiscUtils, NLog, Glob
+
+## Building and Testing
+
+### Build Commands
+```bash
+# Build the entire solution
+dotnet build RecursiveExtractor.sln
+
+# Build a specific project
+dotnet build RecursiveExtractor/RecursiveExtractor.csproj
+```
+
+### Test Commands
+```bash
+# Run all tests
+dotnet test RecursiveExtractor.sln
+
+# Run tests for a specific project
+dotnet test RecursiveExtractor.Tests/RecursiveExtractor.Tests.csproj
+dotnet test RecursiveExtractor.Cli.Tests/RecursiveExtractor.Cli.Tests.csproj
+```
+
+### Restore Packages
+```bash
+dotnet restore RecursiveExtractor.sln
+```
+
+## NuGet Configuration
+
+⚠️ **Important**: The repository uses a private NuGet feed configured in `nuget.config`:
+- The `nuget.config` file points to a private Azure DevOps feed: `https://pkgs.dev.azure.com/microsoft-sdl/General/_packaging/PublicRegistriesFeed/nuget/v3/index.json`
+- **When working as an agent, you may need to temporarily modify `nuget.config` to use public NuGet feeds** (e.g., `https://api.nuget.org/v3/index.json`) to restore packages successfully
+- **ALWAYS restore the `nuget.config` to its original configuration before completing your work**
+- The original configuration must be preserved to maintain consistency with the team's workflow
+
+Example of temporarily switching to public feed:
+```xml
+<?xml version="1.0" encoding="utf-8"?>
+<configuration>
+  <packageSources>
+    <clear />
+    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
+  </packageSources>
+</configuration>
+```
+
+## Code Style Guidelines
+
+### Follow .editorconfig Settings
+- Use 4 spaces for indentation (no tabs)
+- CRLF line endings
+- Open braces on new lines
+- Use `var` for local variables when type is apparent
+- Follow PascalCase for types, methods, and properties
+- Interfaces should begin with 'I'
+- Do not use `this.` qualifier unless necessary
+
+### Naming Conventions
+- **Interfaces**: Start with 'I' (e.g., `ICustomAsyncExtractor`)
+- **Classes**: PascalCase (e.g., `FileEntry`, `Extractor`)
+- **Methods**: PascalCase (e.g., `Extract`, `ExtractAsync`)
+- **Properties**: PascalCase (e.g., `FullPath`, `Content`)
+- **Parameters**: camelCase (e.g., `fileEntry`, `options`)
+
+### C# Best Practices
+- Enable nullable reference types (project uses `<Nullable>Enable</Nullable>`)
+- Prefer pattern matching over `as` with null checks
+- Use expression-bodied members for simple properties and accessors
+- Prefer `null` propagation (`?.`) when appropriate
+- Use async/await for I/O operations
+- Implement both synchronous and asynchronous versions of extraction methods
+
+## Testing Practices
+
+### Test Organization
+- Unit tests go in `RecursiveExtractor.Tests` project
+- CLI tests go in `RecursiveExtractor.Cli.Tests` project
+- Use xUnit as the testing framework
+- Test files should mirror the structure of source files
+
+### Test Naming
+- Use descriptive test names that explain what is being tested
+- Follow pattern: `MethodName_StateUnderTest_ExpectedBehavior`
+
+### Test Data
+- Test archives and files should be placed in appropriate test data directories
+- Include edge cases: nested archives, encrypted files, malformed content, zip bombs
+
+## Security Considerations
+
+- The library includes protections against ZipSlip, Quines, and Zip Bombs
+- Always validate file paths to prevent directory traversal attacks
+- Handle malformed archives gracefully without crashes
+- Implement proper resource cleanup (dispose streams, file handles)
+
+## Documentation
+
+- Add XML documentation comments for public APIs
+- Keep README.md updated with new features or changes
+- Document breaking changes clearly
+- Include code examples for new public APIs
+
+## Project Structure
+
+```
+RecursiveExtractor/              # Main library project
+RecursiveExtractor.Tests/        # Unit tests for library
+RecursiveExtractor.Cli/          # Command-line interface project
+RecursiveExtractor.Cli.Tests/    # Tests for CLI
+```
+
+## Common Patterns
+
+### Extraction Pattern
+- Use `Extractor` class as the main entry point
+- Support both `Extract()` (sync) and `ExtractAsync()` (async) methods
+- Return `IEnumerable<FileEntry>` or `IAsyncEnumerable<FileEntry>`
+- Each `FileEntry` contains a Stream of content that should be disposed properly
+
+### Custom Extractors
+- Implement `ICustomAsyncExtractor` for new archive formats
+- Include `CanExtract()` method to detect file format via magic bytes
+- Preserve stream position in `CanExtract()`
+- Support both sync and async extraction
+
+### Error Handling
+- Throw `OverflowException` for detected quines or zip bombs
+- Throw `TimeoutException` when timing limits are exceeded
+- Log errors and skip invalid files during extraction
+- Use `ExtractSelfOnFail` option to return original archive on failure
+
+## Important Notes
+
+- Multi-targeting means code must be compatible with .NET Standard 2.0
+- Some features (like WIM support) are Windows-only
+- The library automatically detects archive types
+- Streams in FileEntry objects should be disposed by consumers
+- Avoid multiple enumeration of extraction results
+- For parallel processing, use batching mechanism as documented in README