diff --git a/README.instructions.md b/README.instructions.md
index d1cb738a..58194f71 100644
--- a/README.instructions.md
+++ b/README.instructions.md
@@ -60,6 +60,7 @@ Team and project-specific instructions to enhance GitHub Copilot's behavior for
 | [Joyride User Scripts Project Assistant](instructions/joyride-user-project.instructions.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fjoyride-user-project.instructions.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fjoyride-user-project.instructions.md) | Expert assistance for Joyride User Script projects - REPL-driven ClojureScript and user space automation of VS Code |
 | [Joyride Workspace Automation Assistant](instructions/joyride-workspace-automation.instructions.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fjoyride-workspace-automation.instructions.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fjoyride-workspace-automation.instructions.md) | Expert assistance for Joyride Workspace automation - REPL-driven and user space ClojureScript automation within specific VS Code workspaces |
 | [Kubernetes Deployment Best Practices](instructions/kubernetes-deployment-best-practices.instructions.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fkubernetes-deployment-best-practices.instructions.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fkubernetes-deployment-best-practices.instructions.md) | Comprehensive best practices for deploying and managing applications on Kubernetes. Covers Pods, Deployments, Services, Ingress, ConfigMaps, Secrets, health checks, resource limits, scaling, and security contexts. |
+| [LangChain Python Instructions](instructions/langchain-python.instructions.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Flangchain-python.instructions.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Flangchain-python.instructions.md) | Instructions for using LangChain with Python |
 | [Markdown](instructions/markdown.instructions.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fmarkdown.instructions.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fmarkdown.instructions.md) | Documentation and content creation standards |
 | [Memory Bank](instructions/memory-bank.instructions.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fmemory-bank.instructions.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fmemory-bank.instructions.md) | Bank specific coding standards and best practices |
 | [Microsoft 365 Declarative Agents Development Guidelines](instructions/declarative-agents-microsoft365.instructions.md)<br />[![Install in VS Code](https://img.shields.io/badge/VS_Code-Install-0098FF?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fdeclarative-agents-microsoft365.instructions.md)<br />[![Install in VS Code Insiders](https://img.shields.io/badge/VS_Code_Insiders-Install-24bfa5?style=flat-square&logo=visualstudiocode&logoColor=white)](https://aka.ms/awesome-copilot/install/instructions?url=vscode-insiders%3Achat-instructions%2Finstall%3Furl%3Dhttps%3A%2F%2Fraw.githubusercontent.com%2Fgithub%2Fawesome-copilot%2Fmain%2Finstructions%2Fdeclarative-agents-microsoft365.instructions.md) | Comprehensive development guidelines for Microsoft 365 Copilot declarative agents with schema v1.5, TypeSpec integration, and Microsoft 365 Agents Toolkit workflows |
diff --git a/instructions/langchain-python.instructions.md b/instructions/langchain-python.instructions.md
new file mode 100644
index 00000000..06001eb5
--- /dev/null
+++ b/instructions/langchain-python.instructions.md
@@ -0,0 +1,229 @@
---
description: 'Instructions for using LangChain with Python'
applyTo: "**/*.py"
---

# LangChain Python Instructions

These instructions guide GitHub Copilot in generating code and documentation for LangChain applications in Python. Focus on LangChain-specific patterns, APIs, and best practices.

## Runnable Interface (LangChain-specific)

LangChain's `Runnable` interface is the foundation for composing and executing chains, chat models, output parsers, retrievers, and LangGraph graphs. It provides a unified API for invoking, batching, streaming, inspecting, and composing components.

**Key LangChain-specific features:**

- All major LangChain components (chat models, output parsers, retrievers, graphs) implement the Runnable interface.
- Supports synchronous (`invoke`, `batch`, `stream`) and asynchronous (`ainvoke`, `abatch`, `astream`) execution.
- Batching (`batch`, `batch_as_completed`) is optimized for parallel API calls; set `max_concurrency` in `RunnableConfig` to control parallelism.
- Streaming APIs (`stream`, `astream`, `astream_events`) yield outputs as they are produced, which is critical for responsive LLM apps.
- Input/output types are component-specific (e.g., chat models accept messages, retrievers accept strings, output parsers accept model outputs).
- Inspect schemas with `get_input_schema`, `get_output_schema`, and their JSON Schema variants for validation and OpenAPI generation.
- Use `with_types` to override inferred input/output types for complex LCEL chains.
- Compose Runnables declaratively with LCEL: `chain = prompt | chat_model | output_parser`.
- `RunnableConfig` (tags, metadata, callbacks, concurrency) propagates automatically on Python 3.11+; propagate it manually in async code on Python 3.9/3.10.
- Create custom runnables with `RunnableLambda` (simple transforms) or `RunnableGenerator` (streaming transforms); avoid subclassing directly.
- Configure runtime attributes and alternatives with `configurable_fields` and `configurable_alternatives` for dynamic chains and LangServe deployments.

**LangChain best practices** (a composition sketch follows this list):

- Use batching for parallel API calls to LLMs or retrievers; set `max_concurrency` to avoid rate limits.
- Prefer streaming APIs for chat UIs and long outputs.
- Always validate input/output schemas for custom chains and deployed endpoints.
- Use tags and metadata in `RunnableConfig` for tracing in LangSmith and debugging complex chains.
- For custom logic, wrap functions with `RunnableLambda` or `RunnableGenerator` instead of subclassing.
- For advanced configuration, expose fields and alternatives via `configurable_fields` and `configurable_alternatives`.
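A minimal LCEL composition sketch, assuming an OpenAI-backed chat model with the API key in the environment (the prompt text and topic inputs are illustrative):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Each component is a Runnable; `|` composes them into a chain (also a Runnable).
prompt = ChatPromptTemplate.from_template("Explain {topic} in one sentence.")
chain = prompt | ChatOpenAI(model="gpt-4o-mini", temperature=0) | StrOutputParser()

# The composed chain exposes the same unified API:
print(chain.invoke({"topic": "LCEL"}))  # single call
results = chain.batch(
    [{"topic": "runnables"}, {"topic": "retrievers"}],
    config={"max_concurrency": 2},  # cap parallel API calls to respect rate limits
)
for chunk in chain.stream({"topic": "streaming"}):  # tokens as they arrive
    print(chunk, end="", flush=True)
```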
## Chat models (LangChain-specific)

Use LangChain's chat model integrations for conversational AI:

- Import from `langchain.chat_models` or `langchain_openai` (e.g., `ChatOpenAI`).
- Compose messages as a list of `SystemMessage`, `HumanMessage`, and optionally `AIMessage` objects.
- For tool calling, use the `bind_tools(tools)` method.
- For structured outputs, use `with_structured_output(schema)`.

Example:

```python
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI

chat = ChatOpenAI(model="gpt-4", temperature=0)
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is LangChain?"),
]
response = chat.invoke(messages)
print(response.content)
```

- For RAG, combine chat models with retrievers/vectorstores for context injection.
- Use `streaming=True` for real-time token streaming (if supported).
- Use `bind_tools` for function/tool calling (OpenAI, Anthropic, etc.).
- For structured outputs on OpenAI models, prefer `with_structured_output` or the provider's JSON mode (`response_format={"type": "json_object"}`).

Best practices:

- Always validate model outputs before using them in downstream tasks.
- Prefer explicit message types for clarity and reliability.
- For Copilot, provide clear, actionable prompts and document expected outputs.

## Architecture

- LLM client factory: centralize provider configs (API keys), timeouts, retries, and telemetry. Provide a single place to switch providers or client settings.
- Prompt templates: store templates under `prompts/` and load via a safe helper. Keep templates small and testable.
- Chains vs. Agents: prefer chains for deterministic pipelines (RAG, summarization); use agents when you require planning or dynamic tool selection.
- Tools: implement typed adapter interfaces for tools; validate inputs and outputs strictly.
- Memory: default to stateless design. When memory is needed, store minimal context and document retention/erasure policies.
- Retrievers: build retrieval + rerank pipelines. Keep the vectorstore schema stable (id, text, metadata).

### Patterns

- Callbacks & tracing: use LangChain callbacks and integrate with LangSmith or your tracing system to capture the request/response lifecycle.
- Separation of concerns: keep prompt construction, LLM wiring, and business logic separate to simplify testing and reduce accidental prompt changes.

## Embeddings & vectorstores

- Use consistent chunking and metadata fields (source, page, chunk_index).
- Cache embeddings to avoid repeated cost for unchanged documents (see the sketch after this list).
- Local/dev: Chroma or FAISS. Production: managed vector DBs (Pinecone, Qdrant, Milvus, Weaviate) depending on scale and SLAs.
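A sketch of the chunking-plus-cached-embeddings pattern, assuming an OpenAI key is configured; the `handbook.txt` file and `./emb_cache/` directory are illustrative:

```python
from langchain.embeddings import CacheBackedEmbeddings
from langchain.storage import LocalFileStore
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Chunk with stable, queryable metadata on every chunk.
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.create_documents(
    [open("handbook.txt").read()],
    metadatas=[{"source": "handbook.txt", "page": 1}],
)
for i, chunk in enumerate(chunks):
    chunk.metadata["chunk_index"] = i

# Wrap the embedder so unchanged text is embedded once, then served from disk.
underlying = OpenAIEmbeddings()
embedder = CacheBackedEmbeddings.from_bytes_store(
    underlying, LocalFileStore("./emb_cache/"), namespace=underlying.model
)
vectors = embedder.embed_documents([c.page_content for c in chunks])
```

Namespacing the cache by model name keeps embeddings from different models from colliding when you switch providers.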
## Vector stores (LangChain-specific)

- Use LangChain's vectorstore integrations for semantic search, retrieval-augmented generation (RAG), and document similarity workflows.
- Always initialize vectorstores with a supported embedding model (e.g., OpenAIEmbeddings, HuggingFaceEmbeddings).
- Prefer official integrations (e.g., Chroma, FAISS, Pinecone, Qdrant, Weaviate) for production; use InMemoryVectorStore for tests and demos.
- Store documents as LangChain `Document` objects with `page_content` and `metadata`.
- Use `add_documents(documents, ids=...)` to add or update documents. Always provide unique IDs for upserts.
- Use `delete(ids=...)` to remove documents by ID.
- Use `similarity_search(query, k=4, filter={...})` to retrieve the top-k similar documents. Use metadata filters for scoped search.
- For RAG, connect your vectorstore to a retriever and chain it with an LLM (see the LangChain retriever and RAG documentation).
- For advanced search, use vectorstore-specific options: Pinecone supports hybrid search and metadata filtering; Chroma supports filtering and custom distance metrics.
- Always validate the vectorstore integration and API version in your environment; breaking changes are common between LangChain releases.
- Example (InMemoryVectorStore):

```python
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_openai import OpenAIEmbeddings

embedding_model = OpenAIEmbeddings()
vector_store = InMemoryVectorStore(embedding=embedding_model)

documents = [Document(page_content="LangChain content", metadata={"source": "doc1"})]
vector_store.add_documents(documents=documents, ids=["doc1"])

results = vector_store.similarity_search("What is RAG?", k=2)
for doc in results:
    print(doc.page_content, doc.metadata)
```

- For production, prefer persistent vectorstores (Chroma, Pinecone, Qdrant, Weaviate) and configure authentication, scaling, and backup per the provider docs.
- Reference: https://python.langchain.com/docs/integrations/vectorstores/

## Prompt engineering & governance

- Store canonical prompts under `prompts/` and reference them by filename from code.
- Write unit tests that assert required placeholders exist and that rendered prompts fit expected patterns (length, variables present).
- Maintain a CHANGELOG for prompt and schema changes that affect behavior.

## Chat models

LangChain offers a consistent interface for chat models with additional features for monitoring, debugging, and optimization.

### Integrations

Integrations are either:

1. Official: packaged `langchain-` integrations maintained by the LangChain team or the provider.
2. Community: contributed integrations (in `langchain-community`).

Chat models typically follow a naming convention with a `Chat` prefix (e.g., `ChatOpenAI`, `ChatAnthropic`, `ChatOllama`). Models without the `Chat` prefix (or with an `LLM` suffix) often implement the older string-in/string-out interface and are less preferred for modern chat workflows.

### Interface

Chat models implement `BaseChatModel` and support the Runnable interface: streaming, async, batching, and more. Many operations accept and return LangChain `messages` (roles like `system`, `user`, `assistant`). See the BaseChatModel API reference for details.

Key methods include:

- `invoke(messages, ...)`: send a list of messages and receive a response.
- `stream(messages, ...)`: stream partial outputs as tokens arrive.
- `batch(inputs, ...)`: batch multiple requests.
- `bind_tools(tools)`: attach tool adapters for tool calling.
- `with_structured_output(schema)`: helper to request structured responses.

### Inputs and outputs

- LangChain supports its own message format and OpenAI's message format; pick one consistently in your codebase.
- Messages include a `role` and `content` blocks; content can include structured or multimodal payloads where supported.

### Standard parameters

Commonly supported parameters (provider-dependent):

- `model`: model identifier (e.g., `gpt-4o`, `gpt-3.5-turbo`).
- `temperature`: randomness control (0.0 = deterministic, 1.0 = creative).
- `timeout`: seconds to wait before canceling.
- `max_tokens`: response token limit.
- `stop`: stop sequences.
- `max_retries`: retry attempts for network/limit failures.
- `api_key`, `base_url`: provider auth and endpoint configuration.
- `rate_limiter`: optional `BaseRateLimiter` to space requests and avoid provider quota errors.

> Note: Not all parameters are implemented by every provider. Always consult the provider integration docs.

### Tool calling

Chat models can call tools (APIs, DBs, system adapters). Use LangChain's tool-calling APIs to:

- Register tools with strict input/output typing.
- Observe and log tool call requests and results.
- Validate tool outputs before passing them back to the model or executing side effects.

See the tool-calling guide in the LangChain docs for examples and safe patterns; a minimal sketch follows.
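A minimal tool-calling sketch, assuming an OpenAI model with native tool support; the `get_weather` tool and its canned reply are hypothetical:

```python
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI


@tool
def get_weather(city: str) -> str:
    """Return a short weather summary for a city."""
    return f"Sunny in {city}"  # placeholder; a real tool would call a weather API


model = ChatOpenAI(model="gpt-4o-mini").bind_tools([get_weather])
ai_msg = model.invoke("What's the weather in Paris?")

# The model requests the tool rather than answering directly; validate
# the arguments before executing any side effects.
for call in ai_msg.tool_calls:
    if call["name"] == "get_weather":
        print(get_weather.invoke(call["args"]))
```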
### Structured outputs

Use `with_structured_output` or schema-enforced methods to request JSON or typed outputs from the model. Structured outputs are essential for reliable extraction and downstream processing (parsers, DB writes, analytics).

### Multimodality

Some models support multimodal inputs (images, audio). Check provider docs for supported input types and limitations. Multimodal outputs are rare; treat them as experimental and validate rigorously.

### Context window

Models have a finite context window measured in tokens. When designing conversational flows:

- Keep messages concise and prioritize important context.
- Trim old context (summarize or archive) outside the model when it exceeds the window.
- Use a retriever + RAG pattern to surface relevant long-form context instead of pasting large documents into the chat.

## Advanced topics

### Rate limiting

- Use `rate_limiter` when initializing chat models to space calls.
- Implement retry with exponential backoff, and consider fallback models or degraded modes when throttled.

### Caching

- Exact-input caching for conversations is often ineffective. Consider semantic caching (embedding-based) for repeated meaning-level queries.
- Semantic caching introduces a dependency on embeddings and is not universally suitable.
- Cache only where it reduces cost and meets correctness requirements (e.g., FAQ bots).

## Best practices

- Use type hints and dataclasses for public APIs.
- Validate inputs before calling LLMs or tools.
- Load secrets from secret managers; never log secrets or unredacted model outputs.
- Keep tests deterministic: mock LLM and embedding calls.
- Cache embeddings and frequent retrieval results.
- Observability: log request ID, model name, latency, and sanitized token counts.
- Implement exponential backoff and idempotency for external calls.

## Security & privacy

- Treat model outputs as untrusted. Sanitize them before executing generated code or system commands.
- Validate any user-supplied URLs and inputs to avoid SSRF and injection attacks (see the sketch below).
- Document data retention and add an API to erase user data on request.
- Limit stored PII and encrypt sensitive fields at rest.
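A small sketch of the URL-validation idea using an allowlist; the permitted hosts are illustrative, and this complements rather than replaces network-level egress controls:

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {"api.example.com", "docs.example.com"}  # illustrative allowlist


def validate_user_url(raw_url: str) -> str:
    """Reject URLs that could reach internal services (a basic SSRF guard)."""
    parsed = urlparse(raw_url)
    if parsed.scheme not in {"http", "https"}:
        raise ValueError(f"unsupported scheme: {parsed.scheme!r}")
    if parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"host not allowed: {parsed.hostname!r}")
    return raw_url


# Validate before handing a user-supplied URL to a loader or tool.
safe_url = validate_user_url("https://docs.example.com/guide")
```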