22 Commits

Author SHA1 Message Date
Affaan Mustafa
5cb9c1c2a5 Merge pull request #223 from shimo4228/feat/skills/regex-vs-llm-structured-text
feat(skills): add regex-vs-llm-structured-text skill
2026-02-16 14:19:02 -08:00
Affaan Mustafa
595127954f Merge pull request #222 from shimo4228/feat/skills/content-hash-cache-pattern
feat(skills): add content-hash-cache-pattern skill
2026-02-16 14:18:59 -08:00
Affaan Mustafa
bb084229aa Merge pull request #221 from shimo4228/feat/skills/swift-actor-persistence
feat(skills): add swift-actor-persistence skill
2026-02-16 14:18:57 -08:00
Affaan Mustafa
849bb3b425 Merge pull request #220 from shimo4228/feat/skills/swift-protocol-di-testing
feat(skills): add swift-protocol-di-testing skill
2026-02-16 14:18:55 -08:00
Affaan Mustafa
4db215f60d Merge pull request #219 from shimo4228/feat/skills/cost-aware-llm-pipeline
feat(skills): add cost-aware-llm-pipeline skill
2026-02-16 14:18:53 -08:00
Affaan Mustafa
bb1486c404 Merge pull request #229 from voidforall/feat/cpp-coding-standards
feat: add cpp coding standards skill
2026-02-16 14:18:35 -08:00
Affaan Mustafa
9339d4c88c Merge pull request #228 from hrygo/fix/observe.sh-timestamp-export
fix: correct TIMESTAMP environment variable syntax in observe.sh
2026-02-16 14:18:13 -08:00
Affaan Mustafa
2497a9b6e5 Merge pull request #241 from sungpeo/mkdir-rules-readme
docs: require default directory creation for initial user-level rules
2026-02-16 14:18:10 -08:00
Affaan Mustafa
e449471ed3 Merge pull request #230 from neal-zhu/fix/whitelist-plan-files-in-doc-hook
fix: whitelist .claude/plans/ in doc file creation hook
2026-02-16 14:18:08 -08:00
Sungpeo Kook
cad8db21b7 docs: add mkdir for rules directory in ja-JP and zh-CN READMEs 2026-02-17 01:48:37 +09:00
Sungpeo Kook
9d9258c7e1 docs: require default directory creation for initial user-level rules 2026-02-17 01:32:56 +09:00
Maohua Zhu
40e80bcc61 fix: whitelist .claude/plans/ in doc file creation hook
The PreToolUse Write hook blocks creation of .md files to prevent
unnecessary documentation sprawl. However, it also blocks Claude Code's
built-in plan mode from writing plan files to .claude/plans/*.md,
causing "BLOCKED: Unnecessary documentation file creation" errors
every time a plan is created or updated.

Add .claude/plans/ path to the whitelist so plan files are not blocked.
2026-02-15 08:49:15 +08:00
Lin Yuan
eaf710847f fix syntax in illustrative example 2026-02-14 20:36:36 +00:00
voidforall
b169a2e1dd Update .cursor/skills/cpp-coding-standards/SKILL.md
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2026-02-14 20:33:59 +00:00
Lin Yuan
8b4aac4e56 feat: add cpp-coding-standards skill to skills/ and update README 2026-02-14 20:26:33 +00:00
Lin Yuan
08f60355d4 feat: add cpp-coding-standards skill based on C++ Core Guidelines
Comprehensive coding standards for modern C++ (C++17/20/23) derived
from isocpp.github.io/CppCoreGuidelines. Covers philosophy, functions,
classes, resource management, expressions, error handling, immutability,
concurrency, templates, standard library, enumerations, naming, and
performance with DO/DON'T examples and a pre-commit checklist.
2026-02-14 20:20:27 +00:00
黄飞虹
1f74889dbf fix: correct TIMESTAMP environment variable syntax in observe.sh
The inline environment variable syntax `TIMESTAMP="$timestamp" echo ...` 
does not work correctly because:
1. The pipe creates a subshell that doesn't inherit the variable
2. The environment variable is set for echo, not for the piped python

Fixed by using `export` and separating the commands:
- export TIMESTAMP="$timestamp"
- echo "$INPUT_JSON" | python3 -c "..."

This ensures the TIMESTAMP variable is available to the python subprocess.

Fixes #227

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 21:25:50 +08:00
Tatsuya Shimomoto
82d751556c feat(skills): add regex-vs-llm-structured-text skill
Decision framework and hybrid pipeline for choosing between regex
and LLM when parsing structured text.
2026-02-14 12:33:03 +09:00
Tatsuya Shimomoto
3847cc0e0d feat(skills): add content-hash-cache-pattern skill
SHA-256 content-hash based file caching with service layer
separation for expensive processing pipelines.
2026-02-14 12:31:30 +09:00
Tatsuya Shimomoto
94eaaad238 feat(skills): add swift-actor-persistence skill
Thread-safe data persistence patterns using Swift actors with
in-memory cache and file-backed storage.
2026-02-14 12:30:22 +09:00
Tatsuya Shimomoto
ab5be936e9 feat(skills): add swift-protocol-di-testing skill
Protocol-based dependency injection patterns for testable Swift code
with Swift Testing framework examples.
2026-02-14 12:18:48 +09:00
Tatsuya Shimomoto
219bd1ff88 feat(skills): add cost-aware-llm-pipeline skill
Cost optimization patterns for LLM API usage combining model routing,
budget tracking, retry logic, and prompt caching.
2026-02-14 12:16:05 +09:00
12 changed files with 2345 additions and 3 deletions

View File

@@ -0,0 +1,722 @@
---
name: cpp-coding-standards
description: C++ coding standards based on the C++ Core Guidelines (isocpp.github.io). Use when writing, reviewing, or refactoring C++ code to enforce modern, safe, and idiomatic practices.
---
# C++ Coding Standards (C++ Core Guidelines)
Comprehensive coding standards for modern C++ (C++17/20/23) derived from the [C++ Core Guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines). Enforces type safety, resource safety, immutability, and clarity.
## When to Use
- Writing new C++ code (classes, functions, templates)
- Reviewing or refactoring existing C++ code
- Making architectural decisions in C++ projects
- Enforcing consistent style across a C++ codebase
- Choosing between language features (e.g., `enum` vs `enum class`, raw pointer vs smart pointer)
### When NOT to Use
- Non-C++ projects
- Legacy C codebases that cannot adopt modern C++ features
- Embedded/bare-metal contexts where specific guidelines conflict with hardware constraints (adapt selectively)
## Cross-Cutting Principles
These themes recur across the entire guidelines and form the foundation:
1. **RAII everywhere** (P.8, R.1, E.6, CP.20): Bind resource lifetime to object lifetime
2. **Immutability by default** (P.10, Con.1-5, ES.25): Start with `const`/`constexpr`; mutability is the exception
3. **Type safety** (P.4, I.4, ES.46-49, Enum.3): Use the type system to prevent errors at compile time
4. **Express intent** (P.3, F.1, NL.1-2, T.10): Names, types, and concepts should communicate purpose
5. **Minimize complexity** (F.2-3, ES.5, Per.4-5): Simple code is correct code
6. **Value semantics over pointer semantics** (C.10, R.3-5, F.20, CP.31): Prefer returning by value and scoped objects
## Philosophy & Interfaces (P.*, I.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **P.1** | Express ideas directly in code |
| **P.3** | Express intent |
| **P.4** | Ideally, a program should be statically type safe |
| **P.5** | Prefer compile-time checking to run-time checking |
| **P.8** | Don't leak any resources |
| **P.10** | Prefer immutable data to mutable data |
| **I.1** | Make interfaces explicit |
| **I.2** | Avoid non-const global variables |
| **I.4** | Make interfaces precisely and strongly typed |
| **I.11** | Never transfer ownership by a raw pointer or reference |
| **I.23** | Keep the number of function arguments low |
### DO
```cpp
// P.10 + I.4: Immutable, strongly typed interface
struct Temperature {
double kelvin;
};
Temperature boil(const Temperature& water);
```
### DON'T
```cpp
// Weak interface: unclear ownership, unclear units
double boil(double* temp);
// Non-const global variable
int g_counter = 0; // I.2 violation
```
## Functions (F.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **F.1** | Package meaningful operations as carefully named functions |
| **F.2** | A function should perform a single logical operation |
| **F.3** | Keep functions short and simple |
| **F.4** | If a function might be evaluated at compile time, declare it `constexpr` |
| **F.6** | If your function must not throw, declare it `noexcept` |
| **F.8** | Prefer pure functions |
| **F.16** | For "in" parameters, pass cheaply-copied types by value and others by `const&` |
| **F.20** | For "out" values, prefer return values to output parameters |
| **F.21** | To return multiple "out" values, prefer returning a struct |
| **F.43** | Never return a pointer or reference to a local object |
### Parameter Passing
```cpp
// F.16: Cheap types by value, others by const&
void print(int x); // cheap: by value
void analyze(const std::string& data); // expensive: by const&
void transform(std::string s); // sink: by value (will move)
// F.20 + F.21: Return values, not output parameters
struct ParseResult {
std::string token;
int position;
};
ParseResult parse(std::string_view input); // GOOD: return struct
// BAD: output parameters
void parse(std::string_view input,
std::string& token, int& pos); // avoid this
```
### Pure Functions and constexpr
```cpp
// F.4 + F.8: Pure, constexpr where possible
constexpr int factorial(int n) noexcept {
return (n <= 1) ? 1 : n * factorial(n - 1);
}
static_assert(factorial(5) == 120);
```
### Anti-Patterns
- Returning `T&&` from functions (F.45)
- Using `va_arg` / C-style variadics (F.55)
- Capturing by reference in lambdas passed to other threads (F.53)
- Returning `const T` which inhibits move semantics (F.49)
## Classes & Class Hierarchies (C.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **C.2** | Use `class` if invariant exists; `struct` if data members vary independently |
| **C.9** | Minimize exposure of members |
| **C.20** | If you can avoid defining default operations, do (Rule of Zero) |
| **C.21** | If you define or `=delete` any copy/move/destructor, handle them all (Rule of Five) |
| **C.35** | Base class destructor: public virtual or protected non-virtual |
| **C.41** | A constructor should create a fully initialized object |
| **C.46** | Declare single-argument constructors `explicit` |
| **C.67** | A polymorphic class should suppress public copy/move |
| **C.128** | Virtual functions: specify exactly one of `virtual`, `override`, or `final` |
### Rule of Zero
```cpp
// C.20: Let the compiler generate special members
struct Employee {
std::string name;
std::string department;
int id;
// No destructor, copy/move constructors, or assignment operators needed
};
```
### Rule of Five
```cpp
// C.21: If you must manage a resource, define all five
class Buffer {
public:
explicit Buffer(std::size_t size)
: data_(std::make_unique<char[]>(size)), size_(size) {}
~Buffer() = default;
Buffer(const Buffer& other)
: data_(std::make_unique<char[]>(other.size_)), size_(other.size_) {
std::copy_n(other.data_.get(), size_, data_.get());
}
Buffer& operator=(const Buffer& other) {
if (this != &other) {
auto new_data = std::make_unique<char[]>(other.size_);
std::copy_n(other.data_.get(), other.size_, new_data.get());
data_ = std::move(new_data);
size_ = other.size_;
}
return *this;
}
Buffer(Buffer&&) noexcept = default;
Buffer& operator=(Buffer&&) noexcept = default;
private:
std::unique_ptr<char[]> data_;
std::size_t size_;
};
```
### Class Hierarchy
```cpp
// C.35 + C.128: Virtual destructor, use override
class Shape {
public:
virtual ~Shape() = default;
virtual double area() const = 0; // C.121: pure interface
};
class Circle : public Shape {
public:
explicit Circle(double r) : radius_(r) {}
double area() const override { return 3.14159 * radius_ * radius_; }
private:
double radius_;
};
```
### Anti-Patterns
- Calling virtual functions in constructors/destructors (C.82)
- Using `memset`/`memcpy` on non-trivial types (C.90)
- Providing different default arguments for virtual function and overrider (C.140)
- Making data members `const` or references, which suppresses move/copy (C.12)
## Resource Management (R.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **R.1** | Manage resources automatically using RAII |
| **R.3** | A raw pointer (`T*`) is non-owning |
| **R.5** | Prefer scoped objects; don't heap-allocate unnecessarily |
| **R.10** | Avoid `malloc()`/`free()` |
| **R.11** | Avoid calling `new` and `delete` explicitly |
| **R.20** | Use `unique_ptr` or `shared_ptr` to represent ownership |
| **R.21** | Prefer `unique_ptr` over `shared_ptr` unless sharing ownership |
| **R.22** | Use `make_shared()` to make `shared_ptr`s |
### Smart Pointer Usage
```cpp
// R.11 + R.20 + R.21: RAII with smart pointers
auto widget = std::make_unique<Widget>("config"); // unique ownership
auto cache = std::make_shared<Cache>(1024); // shared ownership
// R.3: Raw pointer = non-owning observer
void render(const Widget* w) { // does NOT own w
if (w) w->draw();
}
render(widget.get());
```
### RAII Pattern
```cpp
// R.1: Resource acquisition is initialization
class FileHandle {
public:
explicit FileHandle(const std::string& path)
: handle_(std::fopen(path.c_str(), "r")) {
if (!handle_) throw std::runtime_error("Failed to open: " + path);
}
~FileHandle() {
if (handle_) std::fclose(handle_);
}
FileHandle(const FileHandle&) = delete;
FileHandle& operator=(const FileHandle&) = delete;
FileHandle(FileHandle&& other) noexcept
: handle_(std::exchange(other.handle_, nullptr)) {}
FileHandle& operator=(FileHandle&& other) noexcept {
if (this != &other) {
if (handle_) std::fclose(handle_);
handle_ = std::exchange(other.handle_, nullptr);
}
return *this;
}
private:
std::FILE* handle_;
};
```
### Anti-Patterns
- Naked `new`/`delete` (R.11)
- `malloc()`/`free()` in C++ code (R.10)
- Multiple resource allocations in a single expression (R.13 -- exception safety hazard)
- `shared_ptr` where `unique_ptr` suffices (R.21)
## Expressions & Statements (ES.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **ES.5** | Keep scopes small |
| **ES.20** | Always initialize an object |
| **ES.23** | Prefer `{}` initializer syntax |
| **ES.25** | Declare objects `const` or `constexpr` unless modification is intended |
| **ES.28** | Use lambdas for complex initialization of `const` variables |
| **ES.45** | Avoid magic constants; use symbolic constants |
| **ES.46** | Avoid narrowing/lossy arithmetic conversions |
| **ES.47** | Use `nullptr` rather than `0` or `NULL` |
| **ES.48** | Avoid casts |
| **ES.50** | Don't cast away `const` |
### Initialization
```cpp
// ES.20 + ES.23 + ES.25: Always initialize, prefer {}, default to const
const int max_retries{3};
const std::string name{"widget"};
const std::vector<int> primes{2, 3, 5, 7, 11};
// ES.28: Lambda for complex const initialization
const auto config = [&] {
Config c;
c.timeout = std::chrono::seconds{30};
c.retries = max_retries;
c.verbose = debug_mode;
return c;
}();
```
### Anti-Patterns
- Uninitialized variables (ES.20)
- Using `0` or `NULL` as pointer (ES.47 -- use `nullptr`)
- C-style casts (ES.48 -- use `static_cast`, `const_cast`, etc.)
- Casting away `const` (ES.50)
- Magic numbers without named constants (ES.45)
- Mixing signed and unsigned arithmetic (ES.100)
- Reusing names in nested scopes (ES.12)
## Error Handling (E.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **E.1** | Develop an error-handling strategy early in a design |
| **E.2** | Throw an exception to signal that a function can't perform its assigned task |
| **E.6** | Use RAII to prevent leaks |
| **E.12** | Use `noexcept` when throwing is impossible or unacceptable |
| **E.14** | Use purpose-designed user-defined types as exceptions |
| **E.15** | Throw by value, catch by reference |
| **E.16** | Destructors, deallocation, and swap must never fail |
| **E.17** | Don't try to catch every exception in every function |
### Exception Hierarchy
```cpp
// E.14 + E.15: Custom exception types, throw by value, catch by reference
class AppError : public std::runtime_error {
public:
using std::runtime_error::runtime_error;
};
class NetworkError : public AppError {
public:
NetworkError(const std::string& msg, int code)
: AppError(msg), status_code(code) {}
int status_code;
};
void fetch_data(const std::string& url) {
// E.2: Throw to signal failure
throw NetworkError("connection refused", 503);
}
void run() {
try {
fetch_data("https://api.example.com");
} catch (const NetworkError& e) {
log_error(e.what(), e.status_code);
} catch (const AppError& e) {
log_error(e.what());
}
// E.17: Don't catch everything here -- let unexpected errors propagate
}
```
### Anti-Patterns
- Throwing built-in types like `int` or string literals (E.14)
- Catching by value (slicing risk) (E.15)
- Empty catch blocks that silently swallow errors
- Using exceptions for flow control (E.3)
- Error handling based on global state like `errno` (E.28)
## Constants & Immutability (Con.*)
### All Rules
| Rule | Summary |
|------|---------|
| **Con.1** | By default, make objects immutable |
| **Con.2** | By default, make member functions `const` |
| **Con.3** | By default, pass pointers and references to `const` |
| **Con.4** | Use `const` for values that don't change after construction |
| **Con.5** | Use `constexpr` for values computable at compile time |
```cpp
// Con.1 through Con.5: Immutability by default
class Sensor {
public:
explicit Sensor(std::string id) : id_(std::move(id)) {}
// Con.2: const member functions by default
const std::string& id() const { return id_; }
double last_reading() const { return reading_; }
// Only non-const when mutation is required
void record(double value) { reading_ = value; }
private:
const std::string id_; // Con.4: never changes after construction
double reading_{0.0};
};
// Con.3: Pass by const reference
void display(const Sensor& s) {
std::cout << s.id() << ": " << s.last_reading() << '\n';
}
// Con.5: Compile-time constants
constexpr double PI = 3.14159265358979;
constexpr int MAX_SENSORS = 256;
```
## Concurrency & Parallelism (CP.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **CP.2** | Avoid data races |
| **CP.3** | Minimize explicit sharing of writable data |
| **CP.4** | Think in terms of tasks, rather than threads |
| **CP.8** | Don't use `volatile` for synchronization |
| **CP.20** | Use RAII, never plain `lock()`/`unlock()` |
| **CP.21** | Use `std::scoped_lock` to acquire multiple mutexes |
| **CP.22** | Never call unknown code while holding a lock |
| **CP.42** | Don't wait without a condition |
| **CP.44** | Remember to name your `lock_guard`s and `unique_lock`s |
| **CP.100** | Don't use lock-free programming unless you absolutely have to |
### Safe Locking
```cpp
// CP.20 + CP.44: RAII locks, always named
class ThreadSafeQueue {
public:
void push(int value) {
std::lock_guard<std::mutex> lock(mutex_); // CP.44: named!
queue_.push(value);
cv_.notify_one();
}
int pop() {
std::unique_lock<std::mutex> lock(mutex_);
// CP.42: Always wait with a condition
cv_.wait(lock, [this] { return !queue_.empty(); });
const int value = queue_.front();
queue_.pop();
return value;
}
private:
std::mutex mutex_; // CP.50: mutex with its data
std::condition_variable cv_;
std::queue<int> queue_;
};
```
### Multiple Mutexes
```cpp
// CP.21: std::scoped_lock for multiple mutexes (deadlock-free)
void transfer(Account& from, Account& to, double amount) {
std::scoped_lock lock(from.mutex_, to.mutex_);
from.balance_ -= amount;
to.balance_ += amount;
}
```
### Anti-Patterns
- `volatile` for synchronization (CP.8 -- it's for hardware I/O only)
- Detaching threads (CP.26 -- lifetime management becomes nearly impossible)
- Unnamed lock guards: `std::lock_guard<std::mutex>(m);` destroys immediately (CP.44)
- Holding locks while calling callbacks (CP.22 -- deadlock risk)
- Lock-free programming without deep expertise (CP.100)
## Templates & Generic Programming (T.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **T.1** | Use templates to raise the level of abstraction |
| **T.2** | Use templates to express algorithms for many argument types |
| **T.10** | Specify concepts for all template arguments |
| **T.11** | Use standard concepts whenever possible |
| **T.13** | Prefer shorthand notation for simple concepts |
| **T.43** | Prefer `using` over `typedef` |
| **T.120** | Use template metaprogramming only when you really need to |
| **T.144** | Don't specialize function templates (overload instead) |
### Concepts (C++20)
```cpp
#include <concepts>
// T.10 + T.11: Constrain templates with standard concepts
template<std::integral T>
T gcd(T a, T b) {
while (b != 0) {
a = std::exchange(b, a % b);
}
return a;
}
// T.13: Shorthand concept syntax
void sort(std::ranges::random_access_range auto& range) {
std::ranges::sort(range);
}
// Custom concept for domain-specific constraints
template<typename T>
concept Serializable = requires(const T& t) {
{ t.serialize() } -> std::convertible_to<std::string>;
};
template<Serializable T>
void save(const T& obj, const std::string& path);
```
### Anti-Patterns
- Unconstrained templates in visible namespaces (T.47)
- Specializing function templates instead of overloading (T.144)
- Template metaprogramming where `constexpr` suffices (T.120)
- `typedef` instead of `using` (T.43)
## Standard Library (SL.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **SL.1** | Use libraries wherever possible |
| **SL.2** | Prefer the standard library to other libraries |
| **SL.con.1** | Prefer `std::array` or `std::vector` over C arrays |
| **SL.con.2** | Prefer `std::vector` by default |
| **SL.str.1** | Use `std::string` to own character sequences |
| **SL.str.2** | Use `std::string_view` to refer to character sequences |
| **SL.io.50** | Avoid `endl` (use `'\n'` -- `endl` forces a flush) |
```cpp
// SL.con.1 + SL.con.2: Prefer vector/array over C arrays
const std::array<int, 4> fixed_data{1, 2, 3, 4};
std::vector<std::string> dynamic_data;
// SL.str.1 + SL.str.2: string owns, string_view observes
std::string build_greeting(std::string_view name) {
return "Hello, " + std::string(name) + "!";
}
// SL.io.50: Use '\n' not endl
std::cout << "result: " << value << '\n';
```
## Enumerations (Enum.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **Enum.1** | Prefer enumerations over macros |
| **Enum.3** | Prefer `enum class` over plain `enum` |
| **Enum.5** | Don't use ALL_CAPS for enumerators |
| **Enum.6** | Avoid unnamed enumerations |
```cpp
// Enum.3 + Enum.5: Scoped enum, no ALL_CAPS
enum class Color { red, green, blue };
enum class LogLevel { debug, info, warning, error };
// BAD: plain enum leaks names, ALL_CAPS clashes with macros
enum { RED, GREEN, BLUE }; // Enum.3 + Enum.5 + Enum.6 violation
#define MAX_SIZE 100 // Enum.1 violation -- use constexpr
```
## Source Files & Naming (SF.*, NL.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **SF.1** | Use `.cpp` for code files and `.h` for interface files |
| **SF.7** | Don't write `using namespace` at global scope in a header |
| **SF.8** | Use `#include` guards for all `.h` files |
| **SF.11** | Header files should be self-contained |
| **NL.5** | Avoid encoding type information in names (no Hungarian notation) |
| **NL.8** | Use a consistent naming style |
| **NL.9** | Use ALL_CAPS for macro names only |
| **NL.10** | Prefer `underscore_style` names |
### Header Guard
```cpp
// SF.8: Include guard (or #pragma once)
#ifndef PROJECT_MODULE_WIDGET_H
#define PROJECT_MODULE_WIDGET_H
// SF.11: Self-contained -- include everything this header needs
#include <string>
#include <vector>
namespace project::module {
class Widget {
public:
explicit Widget(std::string name);
const std::string& name() const;
private:
std::string name_;
};
} // namespace project::module
#endif // PROJECT_MODULE_WIDGET_H
```
### Naming Conventions
```cpp
// NL.8 + NL.10: Consistent underscore_style
namespace my_project {
constexpr int max_buffer_size = 4096; // NL.9: not ALL_CAPS (it's not a macro)
class tcp_connection { // underscore_style class
public:
void send_message(std::string_view msg);
bool is_connected() const;
private:
std::string host_; // trailing underscore for members
int port_;
};
} // namespace my_project
```
### Anti-Patterns
- `using namespace std;` in a header at global scope (SF.7)
- Headers that depend on inclusion order (SF.10, SF.11)
- Hungarian notation like `strName`, `iCount` (NL.5)
- ALL_CAPS for anything other than macros (NL.9)
## Performance (Per.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **Per.1** | Don't optimize without reason |
| **Per.2** | Don't optimize prematurely |
| **Per.6** | Don't make claims about performance without measurements |
| **Per.7** | Design to enable optimization |
| **Per.10** | Rely on the static type system |
| **Per.11** | Move computation from run time to compile time |
| **Per.19** | Access memory predictably |
### Guidelines
```cpp
// Per.11: Compile-time computation where possible
constexpr auto lookup_table = [] {
std::array<int, 256> table{};
for (int i = 0; i < 256; ++i) {
table[i] = i * i;
}
return table;
}();
// Per.19: Prefer contiguous data for cache-friendliness
std::vector<Point> points; // GOOD: contiguous
std::vector<std::unique_ptr<Point>> indirect_points; // BAD: pointer chasing
```
### Anti-Patterns
- Optimizing without profiling data (Per.1, Per.6)
- Choosing "clever" low-level code over clear abstractions (Per.4, Per.5)
- Ignoring data layout and cache behavior (Per.19)
## Quick Reference Checklist
Before marking C++ work complete:
- [ ] No raw `new`/`delete` -- use smart pointers or RAII (R.11)
- [ ] Objects initialized at declaration (ES.20)
- [ ] Variables are `const`/`constexpr` by default (Con.1, ES.25)
- [ ] Member functions are `const` where possible (Con.2)
- [ ] `enum class` instead of plain `enum` (Enum.3)
- [ ] `nullptr` instead of `0`/`NULL` (ES.47)
- [ ] No narrowing conversions (ES.46)
- [ ] No C-style casts (ES.48)
- [ ] Single-argument constructors are `explicit` (C.46)
- [ ] Rule of Zero or Rule of Five applied (C.20, C.21)
- [ ] Base class destructors are public virtual or protected non-virtual (C.35)
- [ ] Templates are constrained with concepts (T.10)
- [ ] No `using namespace` in headers at global scope (SF.7)
- [ ] Headers have include guards and are self-contained (SF.8, SF.11)
- [ ] Locks use RAII (`scoped_lock`/`lock_guard`) (CP.20)
- [ ] Exceptions are custom types, thrown by value, caught by reference (E.14, E.15)
- [ ] `'\n'` instead of `std::endl` (SL.io.50)
- [ ] No magic numbers (ES.45)

View File

@@ -222,6 +222,7 @@ everything-claude-code/
| |-- verification-loop/ # Continuous verification (Longform Guide)
| |-- golang-patterns/ # Go idioms and best practices
| |-- golang-testing/ # Go testing patterns, TDD, benchmarks
| |-- cpp-coding-standards/ # C++ coding standards from C++ Core Guidelines (NEW)
| |-- cpp-testing/ # C++ testing with GoogleTest, CMake/CTest (NEW)
| |-- django-patterns/ # Django patterns, models, views (NEW)
| |-- django-security/ # Django security best practices (NEW)
@@ -486,6 +487,7 @@ This gives you instant access to all commands, agents, skills, and hooks.
> git clone https://github.com/affaan-m/everything-claude-code.git
>
> # Option A: User-level rules (applies to all projects)
> mkdir -p ~/.claude/rules
> cp -r everything-claude-code/rules/common/* ~/.claude/rules/
> cp -r everything-claude-code/rules/typescript/* ~/.claude/rules/ # pick your stack
> cp -r everything-claude-code/rules/python/* ~/.claude/rules/

View File

@@ -454,6 +454,7 @@ Duplicate hooks file detected: ./hooks/hooks.json resolves to already-loaded fil
> git clone https://github.com/affaan-m/everything-claude-code.git
>
> # オプション Aユーザーレベルルールすべてのプロジェクトに適用
> mkdir -p ~/.claude/rules
> cp -r everything-claude-code/rules/common/* ~/.claude/rules/
> cp -r everything-claude-code/rules/typescript/* ~/.claude/rules/ # スタックを選択
> cp -r everything-claude-code/rules/python/* ~/.claude/rules/

View File

@@ -456,6 +456,7 @@ Duplicate hooks file detected: ./hooks/hooks.json resolves to already-loaded fil
> git clone https://github.com/affaan-m/everything-claude-code.git
>
> # 选项 A用户级规则适用于所有项目
> mkdir -p ~/.claude/rules
> cp -r everything-claude-code/rules/common/* ~/.claude/rules/
> cp -r everything-claude-code/rules/typescript/* ~/.claude/rules/ # 选择您的技术栈
> cp -r everything-claude-code/rules/python/* ~/.claude/rules/

View File

@@ -37,7 +37,7 @@
"hooks": [
{
"type": "command",
"command": "node -e \"const fs=require('fs');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{try{const i=JSON.parse(d);const p=i.tool_input?.file_path||'';if(/\\.(md|txt)$/.test(p)&&!/(README|CLAUDE|AGENTS|CONTRIBUTING)\\.md$/.test(p)){console.error('[Hook] BLOCKED: Unnecessary documentation file creation');console.error('[Hook] File: '+p);console.error('[Hook] Use README.md for documentation instead');process.exit(2)}}catch{}console.log(d)})\""
"command": "node -e \"const fs=require('fs');let d='';process.stdin.on('data',c=>d+=c);process.stdin.on('end',()=>{try{const i=JSON.parse(d);const p=i.tool_input?.file_path||'';if(/\\.(md|txt)$/.test(p)&&!/(README|CLAUDE|AGENTS|CONTRIBUTING)\\.md$/.test(p)&&!/\\.claude\\/plans\\//.test(p)){console.error('[Hook] BLOCKED: Unnecessary documentation file creation');console.error('[Hook] File: '+p);console.error('[Hook] Use README.md for documentation instead');process.exit(2)}}catch{}console.log(d)})\""
}
],
"description": "Block creation of random .md files - keeps docs consolidated"

View File

@@ -0,0 +1,160 @@
---
name: content-hash-cache-pattern
description: Cache expensive file processing results using SHA-256 content hashes — path-independent, auto-invalidating, with service layer separation.
---
# Content-Hash File Cache Pattern
Cache expensive file processing results (PDF parsing, text extraction, image analysis) using SHA-256 content hashes as cache keys. Unlike path-based caching, this approach survives file moves/renames and auto-invalidates when content changes.
## When to Activate
- Building file processing pipelines (PDF, images, text extraction)
- Processing cost is high and same files are processed repeatedly
- Need a `--cache/--no-cache` CLI option
- Want to add caching to existing pure functions without modifying them
## Core Pattern
### 1. Content-Hash Based Cache Key
Use file content (not path) as the cache key:
```python
import hashlib
from pathlib import Path
_HASH_CHUNK_SIZE = 65536 # 64KB chunks for large files
def compute_file_hash(path: Path) -> str:
"""SHA-256 of file contents (chunked for large files)."""
if not path.is_file():
raise FileNotFoundError(f"File not found: {path}")
sha256 = hashlib.sha256()
with open(path, "rb") as f:
while True:
chunk = f.read(_HASH_CHUNK_SIZE)
if not chunk:
break
sha256.update(chunk)
return sha256.hexdigest()
```
**Why content hash?** File rename/move = cache hit. Content change = automatic invalidation. No index file needed.
### 2. Frozen Dataclass for Cache Entry
```python
from dataclasses import dataclass
@dataclass(frozen=True, slots=True)
class CacheEntry:
file_hash: str
source_path: str
document: ExtractedDocument # The cached result
```
### 3. File-Based Cache Storage
Each cache entry is stored as `{hash}.json` — O(1) lookup by hash, no index file required.
```python
import json
from typing import Any
def write_cache(cache_dir: Path, entry: CacheEntry) -> None:
cache_dir.mkdir(parents=True, exist_ok=True)
cache_file = cache_dir / f"{entry.file_hash}.json"
data = serialize_entry(entry)
cache_file.write_text(json.dumps(data, ensure_ascii=False), encoding="utf-8")
def read_cache(cache_dir: Path, file_hash: str) -> CacheEntry | None:
cache_file = cache_dir / f"{file_hash}.json"
if not cache_file.is_file():
return None
try:
raw = cache_file.read_text(encoding="utf-8")
data = json.loads(raw)
return deserialize_entry(data)
except (json.JSONDecodeError, ValueError, KeyError):
return None # Treat corruption as cache miss
```
### 4. Service Layer Wrapper (SRP)
Keep the processing function pure. Add caching as a separate service layer.
```python
def extract_with_cache(
file_path: Path,
*,
cache_enabled: bool = True,
cache_dir: Path = Path(".cache"),
) -> ExtractedDocument:
"""Service layer: cache check -> extraction -> cache write."""
if not cache_enabled:
return extract_text(file_path) # Pure function, no cache knowledge
file_hash = compute_file_hash(file_path)
# Check cache
cached = read_cache(cache_dir, file_hash)
if cached is not None:
logger.info("Cache hit: %s (hash=%s)", file_path.name, file_hash[:12])
return cached.document
# Cache miss -> extract -> store
logger.info("Cache miss: %s (hash=%s)", file_path.name, file_hash[:12])
doc = extract_text(file_path)
entry = CacheEntry(file_hash=file_hash, source_path=str(file_path), document=doc)
write_cache(cache_dir, entry)
return doc
```
## Key Design Decisions
| Decision | Rationale |
|----------|-----------|
| SHA-256 content hash | Path-independent, auto-invalidates on content change |
| `{hash}.json` file naming | O(1) lookup, no index file needed |
| Service layer wrapper | SRP: extraction stays pure, cache is a separate concern |
| Manual JSON serialization | Full control over frozen dataclass serialization |
| Corruption returns `None` | Graceful degradation, re-processes on next run |
| `cache_dir.mkdir(parents=True)` | Lazy directory creation on first write |
## Best Practices
- **Hash content, not paths** — paths change, content identity doesn't
- **Chunk large files** when hashing — avoid loading entire files into memory
- **Keep processing functions pure** — they should know nothing about caching
- **Log cache hit/miss** with truncated hashes for debugging
- **Handle corruption gracefully** — treat invalid cache entries as misses, never crash
## Anti-Patterns to Avoid
```python
# BAD: Path-based caching (breaks on file move/rename)
cache = {"/path/to/file.pdf": result}
# BAD: Adding cache logic inside the processing function (SRP violation)
def extract_text(path, *, cache_enabled=False, cache_dir=None):
if cache_enabled: # Now this function has two responsibilities
...
# BAD: Using dataclasses.asdict() with nested frozen dataclasses
# (can cause issues with complex nested types)
data = dataclasses.asdict(entry) # Use manual serialization instead
```
## When to Use
- File processing pipelines (PDF parsing, OCR, text extraction, image analysis)
- CLI tools that benefit from `--cache/--no-cache` options
- Batch processing where the same files appear across runs
- Adding caching to existing pure functions without modifying them
## When NOT to Use
- Data that must always be fresh (real-time feeds)
- Cache entries that would be extremely large (consider streaming instead)
- Results that depend on parameters beyond file content (e.g., different extraction configs)

View File

@@ -103,7 +103,8 @@ PARSED_OK=$(echo "$PARSED" | python3 -c "import json,sys; print(json.load(sys.st
if [ "$PARSED_OK" != "True" ]; then
# Fallback: log raw input for debugging
timestamp=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
TIMESTAMP="$timestamp" echo "$INPUT_JSON" | python3 -c "
export TIMESTAMP="$timestamp"
echo "$INPUT_JSON" | python3 -c "
import json, sys, os
raw = sys.stdin.read()[:2000]
print(json.dumps({'timestamp': os.environ['TIMESTAMP'], 'event': 'parse_error', 'raw': raw}))
@@ -124,7 +125,8 @@ fi
# Build and write observation
timestamp=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
TIMESTAMP="$timestamp" echo "$PARSED" | python3 -c "
export TIMESTAMP="$timestamp"
echo "$PARSED" | python3 -c "
import json, sys, os
parsed = json.load(sys.stdin)

View File

@@ -0,0 +1,182 @@
---
name: cost-aware-llm-pipeline
description: Cost optimization patterns for LLM API usage — model routing by task complexity, budget tracking, retry logic, and prompt caching.
---
# Cost-Aware LLM Pipeline
Patterns for controlling LLM API costs while maintaining quality. Combines model routing, budget tracking, retry logic, and prompt caching into a composable pipeline.
## When to Activate
- Building applications that call LLM APIs (Claude, GPT, etc.)
- Processing batches of items with varying complexity
- Need to stay within a budget for API spend
- Optimizing cost without sacrificing quality on complex tasks
## Core Concepts
### 1. Model Routing by Task Complexity
Automatically select cheaper models for simple tasks, reserving expensive models for complex ones.
```python
MODEL_SONNET = "claude-sonnet-4-5-20250929"
MODEL_HAIKU = "claude-haiku-4-5-20251001"
_SONNET_TEXT_THRESHOLD = 10_000 # chars
_SONNET_ITEM_THRESHOLD = 30 # items
def select_model(
text_length: int,
item_count: int,
force_model: str | None = None,
) -> str:
"""Select model based on task complexity."""
if force_model is not None:
return force_model
if text_length >= _SONNET_TEXT_THRESHOLD or item_count >= _SONNET_ITEM_THRESHOLD:
return MODEL_SONNET # Complex task
return MODEL_HAIKU # Simple task (3-4x cheaper)
```
### 2. Immutable Cost Tracking
Track cumulative spend with frozen dataclasses. Each API call returns a new tracker — never mutates state.
```python
from dataclasses import dataclass
@dataclass(frozen=True, slots=True)
class CostRecord:
model: str
input_tokens: int
output_tokens: int
cost_usd: float
@dataclass(frozen=True, slots=True)
class CostTracker:
budget_limit: float = 1.00
records: tuple[CostRecord, ...] = ()
def add(self, record: CostRecord) -> "CostTracker":
"""Return new tracker with added record (never mutates self)."""
return CostTracker(
budget_limit=self.budget_limit,
records=(*self.records, record),
)
@property
def total_cost(self) -> float:
return sum(r.cost_usd for r in self.records)
@property
def over_budget(self) -> bool:
return self.total_cost > self.budget_limit
```
### 3. Narrow Retry Logic
Retry only on transient errors. Fail fast on authentication or bad request errors.
```python
from anthropic import (
APIConnectionError,
InternalServerError,
RateLimitError,
)
_RETRYABLE_ERRORS = (APIConnectionError, RateLimitError, InternalServerError)
_MAX_RETRIES = 3
def call_with_retry(func, *, max_retries: int = _MAX_RETRIES):
"""Retry only on transient errors, fail fast on others."""
for attempt in range(max_retries):
try:
return func()
except _RETRYABLE_ERRORS:
if attempt == max_retries - 1:
raise
time.sleep(2 ** attempt) # Exponential backoff
# AuthenticationError, BadRequestError etc. → raise immediately
```
### 4. Prompt Caching
Cache long system prompts to avoid resending them on every request.
```python
messages = [
{
"role": "user",
"content": [
{
"type": "text",
"text": system_prompt,
"cache_control": {"type": "ephemeral"}, # Cache this
},
{
"type": "text",
"text": user_input, # Variable part
},
],
}
]
```
## Composition
Combine all four techniques in a single pipeline function:
```python
def process(text: str, config: Config, tracker: CostTracker) -> tuple[Result, CostTracker]:
# 1. Route model
model = select_model(len(text), estimated_items, config.force_model)
# 2. Check budget
if tracker.over_budget:
raise BudgetExceededError(tracker.total_cost, tracker.budget_limit)
# 3. Call with retry + caching
response = call_with_retry(lambda: client.messages.create(
model=model,
messages=build_cached_messages(system_prompt, text),
))
# 4. Track cost (immutable)
record = CostRecord(model=model, input_tokens=..., output_tokens=..., cost_usd=...)
tracker = tracker.add(record)
return parse_result(response), tracker
```
## Pricing Reference (2025-2026)
| Model | Input ($/1M tokens) | Output ($/1M tokens) | Relative Cost |
|-------|---------------------|----------------------|---------------|
| Haiku 4.5 | $0.80 | $4.00 | 1x |
| Sonnet 4.5 | $3.00 | $15.00 | ~4x |
| Opus 4.5 | $15.00 | $75.00 | ~19x |
## Best Practices
- **Start with the cheapest model** and only route to expensive models when complexity thresholds are met
- **Set explicit budget limits** before processing batches — fail early rather than overspend
- **Log model selection decisions** so you can tune thresholds based on real data
- **Use prompt caching** for system prompts over 1024 tokens — saves both cost and latency
- **Never retry on authentication or validation errors** — only transient failures (network, rate limit, server error)
## Anti-Patterns to Avoid
- Using the most expensive model for all requests regardless of complexity
- Retrying on all errors (wastes budget on permanent failures)
- Mutating cost tracking state (makes debugging and auditing difficult)
- Hardcoding model names throughout the codebase (use constants or config)
- Ignoring prompt caching for repetitive system prompts
## When to Use
- Any application calling Claude, OpenAI, or similar LLM APIs
- Batch processing pipelines where cost adds up quickly
- Multi-model architectures that need intelligent routing
- Production systems that need budget guardrails

View File

@@ -0,0 +1,722 @@
---
name: cpp-coding-standards
description: C++ coding standards based on the C++ Core Guidelines (isocpp.github.io). Use when writing, reviewing, or refactoring C++ code to enforce modern, safe, and idiomatic practices.
---
# C++ Coding Standards (C++ Core Guidelines)
Comprehensive coding standards for modern C++ (C++17/20/23) derived from the [C++ Core Guidelines](https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines). Enforces type safety, resource safety, immutability, and clarity.
## When to Use
- Writing new C++ code (classes, functions, templates)
- Reviewing or refactoring existing C++ code
- Making architectural decisions in C++ projects
- Enforcing consistent style across a C++ codebase
- Choosing between language features (e.g., `enum` vs `enum class`, raw pointer vs smart pointer)
### When NOT to Use
- Non-C++ projects
- Legacy C codebases that cannot adopt modern C++ features
- Embedded/bare-metal contexts where specific guidelines conflict with hardware constraints (adapt selectively)
## Cross-Cutting Principles
These themes recur across the entire guidelines and form the foundation:
1. **RAII everywhere** (P.8, R.1, E.6, CP.20): Bind resource lifetime to object lifetime
2. **Immutability by default** (P.10, Con.1-5, ES.25): Start with `const`/`constexpr`; mutability is the exception
3. **Type safety** (P.4, I.4, ES.46-49, Enum.3): Use the type system to prevent errors at compile time
4. **Express intent** (P.3, F.1, NL.1-2, T.10): Names, types, and concepts should communicate purpose
5. **Minimize complexity** (F.2-3, ES.5, Per.4-5): Simple code is correct code
6. **Value semantics over pointer semantics** (C.10, R.3-5, F.20, CP.31): Prefer returning by value and scoped objects
## Philosophy & Interfaces (P.*, I.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **P.1** | Express ideas directly in code |
| **P.3** | Express intent |
| **P.4** | Ideally, a program should be statically type safe |
| **P.5** | Prefer compile-time checking to run-time checking |
| **P.8** | Don't leak any resources |
| **P.10** | Prefer immutable data to mutable data |
| **I.1** | Make interfaces explicit |
| **I.2** | Avoid non-const global variables |
| **I.4** | Make interfaces precisely and strongly typed |
| **I.11** | Never transfer ownership by a raw pointer or reference |
| **I.23** | Keep the number of function arguments low |
### DO
```cpp
// P.10 + I.4: Immutable, strongly typed interface
struct Temperature {
double kelvin;
};
Temperature boil(const Temperature& water);
```
### DON'T
```cpp
// Weak interface: unclear ownership, unclear units
double boil(double* temp);
// Non-const global variable
int g_counter = 0; // I.2 violation
```
## Functions (F.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **F.1** | Package meaningful operations as carefully named functions |
| **F.2** | A function should perform a single logical operation |
| **F.3** | Keep functions short and simple |
| **F.4** | If a function might be evaluated at compile time, declare it `constexpr` |
| **F.6** | If your function must not throw, declare it `noexcept` |
| **F.8** | Prefer pure functions |
| **F.16** | For "in" parameters, pass cheaply-copied types by value and others by `const&` |
| **F.20** | For "out" values, prefer return values to output parameters |
| **F.21** | To return multiple "out" values, prefer returning a struct |
| **F.43** | Never return a pointer or reference to a local object |
### Parameter Passing
```cpp
// F.16: Cheap types by value, others by const&
void print(int x); // cheap: by value
void analyze(const std::string& data); // expensive: by const&
void transform(std::string s); // sink: by value (will move)
// F.20 + F.21: Return values, not output parameters
struct ParseResult {
std::string token;
int position;
};
ParseResult parse(std::string_view input); // GOOD: return struct
// BAD: output parameters
void parse(std::string_view input,
std::string& token, int& pos); // avoid this
```
### Pure Functions and constexpr
```cpp
// F.4 + F.8: Pure, constexpr where possible
constexpr int factorial(int n) noexcept {
return (n <= 1) ? 1 : n * factorial(n - 1);
}
static_assert(factorial(5) == 120);
```
### Anti-Patterns
- Returning `T&&` from functions (F.45)
- Using `va_arg` / C-style variadics (F.55)
- Capturing by reference in lambdas passed to other threads (F.53)
- Returning `const T` which inhibits move semantics (F.49)
## Classes & Class Hierarchies (C.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **C.2** | Use `class` if invariant exists; `struct` if data members vary independently |
| **C.9** | Minimize exposure of members |
| **C.20** | If you can avoid defining default operations, do (Rule of Zero) |
| **C.21** | If you define or `=delete` any copy/move/destructor, handle them all (Rule of Five) |
| **C.35** | Base class destructor: public virtual or protected non-virtual |
| **C.41** | A constructor should create a fully initialized object |
| **C.46** | Declare single-argument constructors `explicit` |
| **C.67** | A polymorphic class should suppress public copy/move |
| **C.128** | Virtual functions: specify exactly one of `virtual`, `override`, or `final` |
### Rule of Zero
```cpp
// C.20: Let the compiler generate special members
struct Employee {
std::string name;
std::string department;
int id;
// No destructor, copy/move constructors, or assignment operators needed
};
```
### Rule of Five
```cpp
// C.21: If you must manage a resource, define all five
class Buffer {
public:
explicit Buffer(std::size_t size)
: data_(std::make_unique<char[]>(size)), size_(size) {}
~Buffer() = default;
Buffer(const Buffer& other)
: data_(std::make_unique<char[]>(other.size_)), size_(other.size_) {
std::copy_n(other.data_.get(), size_, data_.get());
}
Buffer& operator=(const Buffer& other) {
if (this != &other) {
auto new_data = std::make_unique<char[]>(other.size_);
std::copy_n(other.data_.get(), other.size_, new_data.get());
data_ = std::move(new_data);
size_ = other.size_;
}
return *this;
}
Buffer(Buffer&&) noexcept = default;
Buffer& operator=(Buffer&&) noexcept = default;
private:
std::unique_ptr<char[]> data_;
std::size_t size_;
};
```
### Class Hierarchy
```cpp
// C.35 + C.128: Virtual destructor, use override
class Shape {
public:
virtual ~Shape() = default;
virtual double area() const = 0; // C.121: pure interface
};
class Circle : public Shape {
public:
explicit Circle(double r) : radius_(r) {}
double area() const override { return 3.14159 * radius_ * radius_; }
private:
double radius_;
};
```
### Anti-Patterns
- Calling virtual functions in constructors/destructors (C.82)
- Using `memset`/`memcpy` on non-trivial types (C.90)
- Providing different default arguments for virtual function and overrider (C.140)
- Making data members `const` or references, which suppresses move/copy (C.12)
## Resource Management (R.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **R.1** | Manage resources automatically using RAII |
| **R.3** | A raw pointer (`T*`) is non-owning |
| **R.5** | Prefer scoped objects; don't heap-allocate unnecessarily |
| **R.10** | Avoid `malloc()`/`free()` |
| **R.11** | Avoid calling `new` and `delete` explicitly |
| **R.20** | Use `unique_ptr` or `shared_ptr` to represent ownership |
| **R.21** | Prefer `unique_ptr` over `shared_ptr` unless sharing ownership |
| **R.22** | Use `make_shared()` to make `shared_ptr`s |
### Smart Pointer Usage
```cpp
// R.11 + R.20 + R.21: RAII with smart pointers
auto widget = std::make_unique<Widget>("config"); // unique ownership
auto cache = std::make_shared<Cache>(1024); // shared ownership
// R.3: Raw pointer = non-owning observer
void render(const Widget* w) { // does NOT own w
if (w) w->draw();
}
render(widget.get());
```
### RAII Pattern
```cpp
// R.1: Resource acquisition is initialization
class FileHandle {
public:
explicit FileHandle(const std::string& path)
: handle_(std::fopen(path.c_str(), "r")) {
if (!handle_) throw std::runtime_error("Failed to open: " + path);
}
~FileHandle() {
if (handle_) std::fclose(handle_);
}
FileHandle(const FileHandle&) = delete;
FileHandle& operator=(const FileHandle&) = delete;
FileHandle(FileHandle&& other) noexcept
: handle_(std::exchange(other.handle_, nullptr)) {}
FileHandle& operator=(FileHandle&& other) noexcept {
if (this != &other) {
if (handle_) std::fclose(handle_);
handle_ = std::exchange(other.handle_, nullptr);
}
return *this;
}
private:
std::FILE* handle_;
};
```
### Anti-Patterns
- Naked `new`/`delete` (R.11)
- `malloc()`/`free()` in C++ code (R.10)
- Multiple resource allocations in a single expression (R.13 -- exception safety hazard)
- `shared_ptr` where `unique_ptr` suffices (R.21)
## Expressions & Statements (ES.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **ES.5** | Keep scopes small |
| **ES.20** | Always initialize an object |
| **ES.23** | Prefer `{}` initializer syntax |
| **ES.25** | Declare objects `const` or `constexpr` unless modification is intended |
| **ES.28** | Use lambdas for complex initialization of `const` variables |
| **ES.45** | Avoid magic constants; use symbolic constants |
| **ES.46** | Avoid narrowing/lossy arithmetic conversions |
| **ES.47** | Use `nullptr` rather than `0` or `NULL` |
| **ES.48** | Avoid casts |
| **ES.50** | Don't cast away `const` |
### Initialization
```cpp
// ES.20 + ES.23 + ES.25: Always initialize, prefer {}, default to const
const int max_retries{3};
const std::string name{"widget"};
const std::vector<int> primes{2, 3, 5, 7, 11};
// ES.28: Lambda for complex const initialization
const auto config = [&] {
Config c;
c.timeout = std::chrono::seconds{30};
c.retries = max_retries;
c.verbose = debug_mode;
return c;
}();
```
### Anti-Patterns
- Uninitialized variables (ES.20)
- Using `0` or `NULL` as pointer (ES.47 -- use `nullptr`)
- C-style casts (ES.48 -- use `static_cast`, `const_cast`, etc.)
- Casting away `const` (ES.50)
- Magic numbers without named constants (ES.45)
- Mixing signed and unsigned arithmetic (ES.100)
- Reusing names in nested scopes (ES.12)
## Error Handling (E.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **E.1** | Develop an error-handling strategy early in a design |
| **E.2** | Throw an exception to signal that a function can't perform its assigned task |
| **E.6** | Use RAII to prevent leaks |
| **E.12** | Use `noexcept` when throwing is impossible or unacceptable |
| **E.14** | Use purpose-designed user-defined types as exceptions |
| **E.15** | Throw by value, catch by reference |
| **E.16** | Destructors, deallocation, and swap must never fail |
| **E.17** | Don't try to catch every exception in every function |
### Exception Hierarchy
```cpp
// E.14 + E.15: Custom exception types, throw by value, catch by reference
class AppError : public std::runtime_error {
public:
using std::runtime_error::runtime_error;
};
class NetworkError : public AppError {
public:
NetworkError(const std::string& msg, int code)
: AppError(msg), status_code(code) {}
int status_code;
};
void fetch_data(const std::string& url) {
// E.2: Throw to signal failure
throw NetworkError("connection refused", 503);
}
void run() {
try {
fetch_data("https://api.example.com");
} catch (const NetworkError& e) {
log_error(e.what(), e.status_code);
} catch (const AppError& e) {
log_error(e.what());
}
// E.17: Don't catch everything here -- let unexpected errors propagate
}
```
### Anti-Patterns
- Throwing built-in types like `int` or string literals (E.14)
- Catching by value (slicing risk) (E.15)
- Empty catch blocks that silently swallow errors
- Using exceptions for flow control (E.3)
- Error handling based on global state like `errno` (E.28)
## Constants & Immutability (Con.*)
### All Rules
| Rule | Summary |
|------|---------|
| **Con.1** | By default, make objects immutable |
| **Con.2** | By default, make member functions `const` |
| **Con.3** | By default, pass pointers and references to `const` |
| **Con.4** | Use `const` for values that don't change after construction |
| **Con.5** | Use `constexpr` for values computable at compile time |
```cpp
// Con.1 through Con.5: Immutability by default
class Sensor {
public:
explicit Sensor(std::string id) : id_(std::move(id)) {}
// Con.2: const member functions by default
const std::string& id() const { return id_; }
double last_reading() const { return reading_; }
// Only non-const when mutation is required
void record(double value) { reading_ = value; }
private:
const std::string id_; // Con.4: never changes after construction
double reading_{0.0};
};
// Con.3: Pass by const reference
void display(const Sensor& s) {
std::cout << s.id() << ": " << s.last_reading() << '\n';
}
// Con.5: Compile-time constants
constexpr double PI = 3.14159265358979;
constexpr int MAX_SENSORS = 256;
```
## Concurrency & Parallelism (CP.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **CP.2** | Avoid data races |
| **CP.3** | Minimize explicit sharing of writable data |
| **CP.4** | Think in terms of tasks, rather than threads |
| **CP.8** | Don't use `volatile` for synchronization |
| **CP.20** | Use RAII, never plain `lock()`/`unlock()` |
| **CP.21** | Use `std::scoped_lock` to acquire multiple mutexes |
| **CP.22** | Never call unknown code while holding a lock |
| **CP.42** | Don't wait without a condition |
| **CP.44** | Remember to name your `lock_guard`s and `unique_lock`s |
| **CP.100** | Don't use lock-free programming unless you absolutely have to |
### Safe Locking
```cpp
// CP.20 + CP.44: RAII locks, always named
class ThreadSafeQueue {
public:
void push(int value) {
std::lock_guard<std::mutex> lock(mutex_); // CP.44: named!
queue_.push(value);
cv_.notify_one();
}
int pop() {
std::unique_lock<std::mutex> lock(mutex_);
// CP.42: Always wait with a condition
cv_.wait(lock, [this] { return !queue_.empty(); });
const int value = queue_.front();
queue_.pop();
return value;
}
private:
std::mutex mutex_; // CP.50: mutex with its data
std::condition_variable cv_;
std::queue<int> queue_;
};
```
### Multiple Mutexes
```cpp
// CP.21: std::scoped_lock for multiple mutexes (deadlock-free)
void transfer(Account& from, Account& to, double amount) {
std::scoped_lock lock(from.mutex_, to.mutex_);
from.balance_ -= amount;
to.balance_ += amount;
}
```
### Anti-Patterns
- `volatile` for synchronization (CP.8 -- it's for hardware I/O only)
- Detaching threads (CP.26 -- lifetime management becomes nearly impossible)
- Unnamed lock guards: `std::lock_guard<std::mutex>(m);` destroys immediately (CP.44)
- Holding locks while calling callbacks (CP.22 -- deadlock risk)
- Lock-free programming without deep expertise (CP.100)
## Templates & Generic Programming (T.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **T.1** | Use templates to raise the level of abstraction |
| **T.2** | Use templates to express algorithms for many argument types |
| **T.10** | Specify concepts for all template arguments |
| **T.11** | Use standard concepts whenever possible |
| **T.13** | Prefer shorthand notation for simple concepts |
| **T.43** | Prefer `using` over `typedef` |
| **T.120** | Use template metaprogramming only when you really need to |
| **T.144** | Don't specialize function templates (overload instead) |
### Concepts (C++20)
```cpp
#include <concepts>
// T.10 + T.11: Constrain templates with standard concepts
template<std::integral T>
T gcd(T a, T b) {
while (b != 0) {
a = std::exchange(b, a % b);
}
return a;
}
// T.13: Shorthand concept syntax
void sort(std::ranges::random_access_range auto& range) {
std::ranges::sort(range);
}
// Custom concept for domain-specific constraints
template<typename T>
concept Serializable = requires(const T& t) {
{ t.serialize() } -> std::convertible_to<std::string>;
};
template<Serializable T>
void save(const T& obj, const std::string& path);
```
### Anti-Patterns
- Unconstrained templates in visible namespaces (T.47)
- Specializing function templates instead of overloading (T.144)
- Template metaprogramming where `constexpr` suffices (T.120)
- `typedef` instead of `using` (T.43)
## Standard Library (SL.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **SL.1** | Use libraries wherever possible |
| **SL.2** | Prefer the standard library to other libraries |
| **SL.con.1** | Prefer `std::array` or `std::vector` over C arrays |
| **SL.con.2** | Prefer `std::vector` by default |
| **SL.str.1** | Use `std::string` to own character sequences |
| **SL.str.2** | Use `std::string_view` to refer to character sequences |
| **SL.io.50** | Avoid `endl` (use `'\n'` -- `endl` forces a flush) |
```cpp
// SL.con.1 + SL.con.2: Prefer vector/array over C arrays
const std::array<int, 4> fixed_data{1, 2, 3, 4};
std::vector<std::string> dynamic_data;
// SL.str.1 + SL.str.2: string owns, string_view observes
std::string build_greeting(std::string_view name) {
return "Hello, " + std::string(name) + "!";
}
// SL.io.50: Use '\n' not endl
std::cout << "result: " << value << '\n';
```
## Enumerations (Enum.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **Enum.1** | Prefer enumerations over macros |
| **Enum.3** | Prefer `enum class` over plain `enum` |
| **Enum.5** | Don't use ALL_CAPS for enumerators |
| **Enum.6** | Avoid unnamed enumerations |
```cpp
// Enum.3 + Enum.5: Scoped enum, no ALL_CAPS
enum class Color { red, green, blue };
enum class LogLevel { debug, info, warning, error };
// BAD: plain enum leaks names, ALL_CAPS clashes with macros
enum { RED, GREEN, BLUE }; // Enum.3 + Enum.5 + Enum.6 violation
#define MAX_SIZE 100 // Enum.1 violation -- use constexpr
```
## Source Files & Naming (SF.*, NL.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **SF.1** | Use `.cpp` for code files and `.h` for interface files |
| **SF.7** | Don't write `using namespace` at global scope in a header |
| **SF.8** | Use `#include` guards for all `.h` files |
| **SF.11** | Header files should be self-contained |
| **NL.5** | Avoid encoding type information in names (no Hungarian notation) |
| **NL.8** | Use a consistent naming style |
| **NL.9** | Use ALL_CAPS for macro names only |
| **NL.10** | Prefer `underscore_style` names |
### Header Guard
```cpp
// SF.8: Include guard (or #pragma once)
#ifndef PROJECT_MODULE_WIDGET_H
#define PROJECT_MODULE_WIDGET_H
// SF.11: Self-contained -- include everything this header needs
#include <string>
#include <vector>
namespace project::module {
class Widget {
public:
explicit Widget(std::string name);
const std::string& name() const;
private:
std::string name_;
};
} // namespace project::module
#endif // PROJECT_MODULE_WIDGET_H
```
### Naming Conventions
```cpp
// NL.8 + NL.10: Consistent underscore_style
namespace my_project {
constexpr int max_buffer_size = 4096; // NL.9: not ALL_CAPS (it's not a macro)
class tcp_connection { // underscore_style class
public:
void send_message(std::string_view msg);
bool is_connected() const;
private:
std::string host_; // trailing underscore for members
int port_;
};
} // namespace my_project
```
### Anti-Patterns
- `using namespace std;` in a header at global scope (SF.7)
- Headers that depend on inclusion order (SF.10, SF.11)
- Hungarian notation like `strName`, `iCount` (NL.5)
- ALL_CAPS for anything other than macros (NL.9)
## Performance (Per.*)
### Key Rules
| Rule | Summary |
|------|---------|
| **Per.1** | Don't optimize without reason |
| **Per.2** | Don't optimize prematurely |
| **Per.6** | Don't make claims about performance without measurements |
| **Per.7** | Design to enable optimization |
| **Per.10** | Rely on the static type system |
| **Per.11** | Move computation from run time to compile time |
| **Per.19** | Access memory predictably |
### Guidelines
```cpp
// Per.11: Compile-time computation where possible
constexpr auto lookup_table = [] {
std::array<int, 256> table{};
for (int i = 0; i < 256; ++i) {
table[i] = i * i;
}
return table;
}();
// Per.19: Prefer contiguous data for cache-friendliness
std::vector<Point> points; // GOOD: contiguous
std::vector<std::unique_ptr<Point>> indirect_points; // BAD: pointer chasing
```
### Anti-Patterns
- Optimizing without profiling data (Per.1, Per.6)
- Choosing "clever" low-level code over clear abstractions (Per.4, Per.5)
- Ignoring data layout and cache behavior (Per.19)
## Quick Reference Checklist
Before marking C++ work complete:
- [ ] No raw `new`/`delete` -- use smart pointers or RAII (R.11)
- [ ] Objects initialized at declaration (ES.20)
- [ ] Variables are `const`/`constexpr` by default (Con.1, ES.25)
- [ ] Member functions are `const` where possible (Con.2)
- [ ] `enum class` instead of plain `enum` (Enum.3)
- [ ] `nullptr` instead of `0`/`NULL` (ES.47)
- [ ] No narrowing conversions (ES.46)
- [ ] No C-style casts (ES.48)
- [ ] Single-argument constructors are `explicit` (C.46)
- [ ] Rule of Zero or Rule of Five applied (C.20, C.21)
- [ ] Base class destructors are public virtual or protected non-virtual (C.35)
- [ ] Templates are constrained with concepts (T.10)
- [ ] No `using namespace` in headers at global scope (SF.7)
- [ ] Headers have include guards and are self-contained (SF.8, SF.11)
- [ ] Locks use RAII (`scoped_lock`/`lock_guard`) (CP.20)
- [ ] Exceptions are custom types, thrown by value, caught by reference (E.14, E.15)
- [ ] `'\n'` instead of `std::endl` (SL.io.50)
- [ ] No magic numbers (ES.45)

View File

@@ -0,0 +1,219 @@
---
name: regex-vs-llm-structured-text
description: Decision framework for choosing between regex and LLM when parsing structured text — start with regex, add LLM only for low-confidence edge cases.
---
# Regex vs LLM for Structured Text Parsing
A practical decision framework for parsing structured text (quizzes, forms, invoices, documents). The key insight: regex handles 95-98% of cases cheaply and deterministically. Reserve expensive LLM calls for the remaining edge cases.
## When to Activate
- Parsing structured text with repeating patterns (questions, forms, tables)
- Deciding between regex and LLM for text extraction
- Building hybrid pipelines that combine both approaches
- Optimizing cost/accuracy tradeoffs in text processing
## Decision Framework
```
Is the text format consistent and repeating?
├── Yes (>90% follows a pattern) → Start with Regex
│ ├── Regex handles 95%+ → Done, no LLM needed
│ └── Regex handles <95% → Add LLM for edge cases only
└── No (free-form, highly variable) → Use LLM directly
```
## Architecture Pattern
```
Source Text
[Regex Parser] ─── Extracts structure (95-98% accuracy)
[Text Cleaner] ─── Removes noise (markers, page numbers, artifacts)
[Confidence Scorer] ─── Flags low-confidence extractions
├── High confidence (≥0.95) → Direct output
└── Low confidence (<0.95) → [LLM Validator] → Output
```
## Implementation
### 1. Regex Parser (Handles the Majority)
```python
import re
from dataclasses import dataclass
@dataclass(frozen=True)
class ParsedItem:
id: str
text: str
choices: tuple[str, ...]
answer: str
confidence: float = 1.0
def parse_structured_text(content: str) -> list[ParsedItem]:
"""Parse structured text using regex patterns."""
pattern = re.compile(
r"(?P<id>\d+)\.\s*(?P<text>.+?)\n"
r"(?P<choices>(?:[A-D]\..+?\n)+)"
r"Answer:\s*(?P<answer>[A-D])",
re.MULTILINE | re.DOTALL,
)
items = []
for match in pattern.finditer(content):
choices = tuple(
c.strip() for c in re.findall(r"[A-D]\.\s*(.+)", match.group("choices"))
)
items.append(ParsedItem(
id=match.group("id"),
text=match.group("text").strip(),
choices=choices,
answer=match.group("answer"),
))
return items
```
### 2. Confidence Scoring
Flag items that may need LLM review:
```python
@dataclass(frozen=True)
class ConfidenceFlag:
item_id: str
score: float
reasons: tuple[str, ...]
def score_confidence(item: ParsedItem) -> ConfidenceFlag:
"""Score extraction confidence and flag issues."""
reasons = []
score = 1.0
if len(item.choices) < 3:
reasons.append("few_choices")
score -= 0.3
if not item.answer:
reasons.append("missing_answer")
score -= 0.5
if len(item.text) < 10:
reasons.append("short_text")
score -= 0.2
return ConfidenceFlag(
item_id=item.id,
score=max(0.0, score),
reasons=tuple(reasons),
)
def identify_low_confidence(
items: list[ParsedItem],
threshold: float = 0.95,
) -> list[ConfidenceFlag]:
"""Return items below confidence threshold."""
flags = [score_confidence(item) for item in items]
return [f for f in flags if f.score < threshold]
```
### 3. LLM Validator (Edge Cases Only)
```python
def validate_with_llm(
item: ParsedItem,
original_text: str,
client,
) -> ParsedItem:
"""Use LLM to fix low-confidence extractions."""
response = client.messages.create(
model="claude-haiku-4-5-20251001", # Cheapest model for validation
max_tokens=500,
messages=[{
"role": "user",
"content": (
f"Extract the question, choices, and answer from this text.\n\n"
f"Text: {original_text}\n\n"
f"Current extraction: {item}\n\n"
f"Return corrected JSON if needed, or 'CORRECT' if accurate."
),
}],
)
# Parse LLM response and return corrected item...
return corrected_item
```
### 4. Hybrid Pipeline
```python
def process_document(
content: str,
*,
llm_client=None,
confidence_threshold: float = 0.95,
) -> list[ParsedItem]:
"""Full pipeline: regex -> confidence check -> LLM for edge cases."""
# Step 1: Regex extraction (handles 95-98%)
items = parse_structured_text(content)
# Step 2: Confidence scoring
low_confidence = identify_low_confidence(items, confidence_threshold)
if not low_confidence or llm_client is None:
return items
# Step 3: LLM validation (only for flagged items)
low_conf_ids = {f.item_id for f in low_confidence}
result = []
for item in items:
if item.id in low_conf_ids:
result.append(validate_with_llm(item, content, llm_client))
else:
result.append(item)
return result
```
## Real-World Metrics
From a production quiz parsing pipeline (410 items):
| Metric | Value |
|--------|-------|
| Regex success rate | 98.0% |
| Low confidence items | 8 (2.0%) |
| LLM calls needed | ~5 |
| Cost savings vs all-LLM | ~95% |
| Test coverage | 93% |
## Best Practices
- **Start with regex** — even imperfect regex gives you a baseline to improve
- **Use confidence scoring** to programmatically identify what needs LLM help
- **Use the cheapest LLM** for validation (Haiku-class models are sufficient)
- **Never mutate** parsed items — return new instances from cleaning/validation steps
- **TDD works well** for parsers — write tests for known patterns first, then edge cases
- **Log metrics** (regex success rate, LLM call count) to track pipeline health
## Anti-Patterns to Avoid
- Sending all text to an LLM when regex handles 95%+ of cases (expensive and slow)
- Using regex for free-form, highly variable text (LLM is better here)
- Skipping confidence scoring and hoping regex "just works"
- Mutating parsed objects during cleaning/validation steps
- Not testing edge cases (malformed input, missing fields, encoding issues)
## When to Use
- Quiz/exam question parsing
- Form data extraction
- Invoice/receipt processing
- Document structure parsing (headers, sections, tables)
- Any structured text with repeating patterns where cost matters

View File

@@ -0,0 +1,142 @@
---
name: swift-actor-persistence
description: Thread-safe data persistence in Swift using actors — in-memory cache with file-backed storage, eliminating data races by design.
---
# Swift Actors for Thread-Safe Persistence
Patterns for building thread-safe data persistence layers using Swift actors. Combines in-memory caching with file-backed storage, leveraging the actor model to eliminate data races at compile time.
## When to Activate
- Building a data persistence layer in Swift 5.5+
- Need thread-safe access to shared mutable state
- Want to eliminate manual synchronization (locks, DispatchQueues)
- Building offline-first apps with local storage
## Core Pattern
### Actor-Based Repository
The actor model guarantees serialized access — no data races, enforced by the compiler.
```swift
public actor LocalRepository<T: Codable & Identifiable> where T.ID == String {
private var cache: [String: T] = [:]
private let fileURL: URL
public init(directory: URL = .documentsDirectory, filename: String = "data.json") {
self.fileURL = directory.appendingPathComponent(filename)
// Synchronous load during init (actor isolation not yet active)
self.cache = Self.loadSynchronously(from: fileURL)
}
// MARK: - Public API
public func save(_ item: T) throws {
cache[item.id] = item
try persistToFile()
}
public func delete(_ id: String) throws {
cache[id] = nil
try persistToFile()
}
public func find(by id: String) -> T? {
cache[id]
}
public func loadAll() -> [T] {
Array(cache.values)
}
// MARK: - Private
private func persistToFile() throws {
let data = try JSONEncoder().encode(Array(cache.values))
try data.write(to: fileURL, options: .atomic)
}
private static func loadSynchronously(from url: URL) -> [String: T] {
guard let data = try? Data(contentsOf: url),
let items = try? JSONDecoder().decode([T].self, from: data) else {
return [:]
}
return Dictionary(uniqueKeysWithValues: items.map { ($0.id, $0) })
}
}
```
### Usage
All calls are automatically async due to actor isolation:
```swift
let repository = LocalRepository<Question>()
// Read fast O(1) lookup from in-memory cache
let question = await repository.find(by: "q-001")
let allQuestions = await repository.loadAll()
// Write updates cache and persists to file atomically
try await repository.save(newQuestion)
try await repository.delete("q-001")
```
### Combining with @Observable ViewModel
```swift
@Observable
final class QuestionListViewModel {
private(set) var questions: [Question] = []
private let repository: LocalRepository<Question>
init(repository: LocalRepository<Question> = LocalRepository()) {
self.repository = repository
}
func load() async {
questions = await repository.loadAll()
}
func add(_ question: Question) async throws {
try await repository.save(question)
questions = await repository.loadAll()
}
}
```
## Key Design Decisions
| Decision | Rationale |
|----------|-----------|
| Actor (not class + lock) | Compiler-enforced thread safety, no manual synchronization |
| In-memory cache + file persistence | Fast reads from cache, durable writes to disk |
| Synchronous init loading | Avoids async initialization complexity |
| Dictionary keyed by ID | O(1) lookups by identifier |
| Generic over `Codable & Identifiable` | Reusable across any model type |
| Atomic file writes (`.atomic`) | Prevents partial writes on crash |
## Best Practices
- **Use `Sendable` types** for all data crossing actor boundaries
- **Keep the actor's public API minimal** — only expose domain operations, not persistence details
- **Use `.atomic` writes** to prevent data corruption if the app crashes mid-write
- **Load synchronously in `init`** — async initializers add complexity with minimal benefit for local files
- **Combine with `@Observable`** ViewModels for reactive UI updates
## Anti-Patterns to Avoid
- Using `DispatchQueue` or `NSLock` instead of actors for new Swift concurrency code
- Exposing the internal cache dictionary to external callers
- Making the file URL configurable without validation
- Forgetting that all actor method calls are `await` — callers must handle async context
- Using `nonisolated` to bypass actor isolation (defeats the purpose)
## When to Use
- Local data storage in iOS/macOS apps (user data, settings, cached content)
- Offline-first architectures that sync to a server later
- Any shared mutable state that multiple parts of the app access concurrently
- Replacing legacy `DispatchQueue`-based thread safety with modern Swift concurrency

View File

@@ -0,0 +1,189 @@
---
name: swift-protocol-di-testing
description: Protocol-based dependency injection for testable Swift code — mock file system, network, and external APIs using focused protocols and Swift Testing.
---
# Swift Protocol-Based Dependency Injection for Testing
Patterns for making Swift code testable by abstracting external dependencies (file system, network, iCloud) behind small, focused protocols. Enables deterministic tests without I/O.
## When to Activate
- Writing Swift code that accesses file system, network, or external APIs
- Need to test error handling paths without triggering real failures
- Building modules that work across environments (app, test, SwiftUI preview)
- Designing testable architecture with Swift concurrency (actors, Sendable)
## Core Pattern
### 1. Define Small, Focused Protocols
Each protocol handles exactly one external concern.
```swift
// File system access
public protocol FileSystemProviding: Sendable {
func containerURL(for purpose: Purpose) -> URL?
}
// File read/write operations
public protocol FileAccessorProviding: Sendable {
func read(from url: URL) throws -> Data
func write(_ data: Data, to url: URL) throws
func fileExists(at url: URL) -> Bool
}
// Bookmark storage (e.g., for sandboxed apps)
public protocol BookmarkStorageProviding: Sendable {
func saveBookmark(_ data: Data, for key: String) throws
func loadBookmark(for key: String) throws -> Data?
}
```
### 2. Create Default (Production) Implementations
```swift
public struct DefaultFileSystemProvider: FileSystemProviding {
public init() {}
public func containerURL(for purpose: Purpose) -> URL? {
FileManager.default.url(forUbiquityContainerIdentifier: nil)
}
}
public struct DefaultFileAccessor: FileAccessorProviding {
public init() {}
public func read(from url: URL) throws -> Data {
try Data(contentsOf: url)
}
public func write(_ data: Data, to url: URL) throws {
try data.write(to: url, options: .atomic)
}
public func fileExists(at url: URL) -> Bool {
FileManager.default.fileExists(atPath: url.path)
}
}
```
### 3. Create Mock Implementations for Testing
```swift
public final class MockFileAccessor: FileAccessorProviding, @unchecked Sendable {
public var files: [URL: Data] = [:]
public var readError: Error?
public var writeError: Error?
public init() {}
public func read(from url: URL) throws -> Data {
if let error = readError { throw error }
guard let data = files[url] else {
throw CocoaError(.fileReadNoSuchFile)
}
return data
}
public func write(_ data: Data, to url: URL) throws {
if let error = writeError { throw error }
files[url] = data
}
public func fileExists(at url: URL) -> Bool {
files[url] != nil
}
}
```
### 4. Inject Dependencies with Default Parameters
Production code uses defaults; tests inject mocks.
```swift
public actor SyncManager {
private let fileSystem: FileSystemProviding
private let fileAccessor: FileAccessorProviding
public init(
fileSystem: FileSystemProviding = DefaultFileSystemProvider(),
fileAccessor: FileAccessorProviding = DefaultFileAccessor()
) {
self.fileSystem = fileSystem
self.fileAccessor = fileAccessor
}
public func sync() async throws {
guard let containerURL = fileSystem.containerURL(for: .sync) else {
throw SyncError.containerNotAvailable
}
let data = try fileAccessor.read(
from: containerURL.appendingPathComponent("data.json")
)
// Process data...
}
}
```
### 5. Write Tests with Swift Testing
```swift
import Testing
@Test("Sync manager handles missing container")
func testMissingContainer() async {
let mockFileSystem = MockFileSystemProvider(containerURL: nil)
let manager = SyncManager(fileSystem: mockFileSystem)
await #expect(throws: SyncError.containerNotAvailable) {
try await manager.sync()
}
}
@Test("Sync manager reads data correctly")
func testReadData() async throws {
let mockFileAccessor = MockFileAccessor()
mockFileAccessor.files[testURL] = testData
let manager = SyncManager(fileAccessor: mockFileAccessor)
let result = try await manager.loadData()
#expect(result == expectedData)
}
@Test("Sync manager handles read errors gracefully")
func testReadError() async {
let mockFileAccessor = MockFileAccessor()
mockFileAccessor.readError = CocoaError(.fileReadCorruptFile)
let manager = SyncManager(fileAccessor: mockFileAccessor)
await #expect(throws: SyncError.self) {
try await manager.sync()
}
}
```
## Best Practices
- **Single Responsibility**: Each protocol should handle one concern — don't create "god protocols" with many methods
- **Sendable conformance**: Required when protocols are used across actor boundaries
- **Default parameters**: Let production code use real implementations by default; only tests need to specify mocks
- **Error simulation**: Design mocks with configurable error properties for testing failure paths
- **Only mock boundaries**: Mock external dependencies (file system, network, APIs), not internal types
## Anti-Patterns to Avoid
- Creating a single large protocol that covers all external access
- Mocking internal types that have no external dependencies
- Using `#if DEBUG` conditionals instead of proper dependency injection
- Forgetting `Sendable` conformance when used with actors
- Over-engineering: if a type has no external dependencies, it doesn't need a protocol
## When to Use
- Any Swift code that touches file system, network, or external APIs
- Testing error handling paths that are hard to trigger in real environments
- Building modules that need to work in app, test, and SwiftUI preview contexts
- Apps using Swift concurrency (actors, structured concurrency) that need testable architecture