# extract()

> Complete API reference for the extract() method

<CardGroup cols={1}>
  <Card title="Extract" icon="ufo-beam" href="/v2/basics/extract">
    See how to use extract() to extract structured data from web pages
  </Card>
</CardGroup>

### Method Signatures

<Tabs>
  <Tab title="TypeScript">
    ```typescript theme={null}
    // With schema and options
    await page.extract<T extends z.AnyZodObject>(options: ExtractOptions<T>): Promise<ExtractResult<T>>

    // String instruction only
    await page.extract(instruction: string): Promise<{ extraction: string }>

    // No parameters (raw page content)
    await page.extract(): Promise<{ pageText: string }>
    ```

    **ExtractOptions Interface:**

    ```typescript theme={null}
    interface ExtractOptions<T extends z.AnyZodObject> {
      instruction?: string;
      schema?: T;
      modelName?: AvailableModel;
      modelClientOptions?: ClientOptions;
      domSettleTimeoutMs?: number;
      selector?: string;
      iframes?: boolean;
    }

    type ExtractResult<T> = z.infer<T>;
    ```
  </Tab>

  <Tab title="Python">
    ```python theme={null}
    # With schema and parameters
    await page.extract(
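        instruction: Optional[str] = None,
        schema: Optional[Type[BaseModel]] = None,  # the model class itself, not an instance
        selector: Optional[str] = None,
        iframes: Optional[bool] = None,
        model_name: Optional[AvailableModel] = None,
        model_client_options: Optional[Dict] = None,
        dom_settle_timeout_ms: Optional[int] = None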
    ) -> ExtractResult

    # String instruction only
    await page.extract(instruction: str) -> Dict[str, str]

    # No parameters (raw page content)
    await page.extract() -> Dict[str, str]
    ```
  </Tab>
</Tabs>

### Parameters

<ParamField path="instruction" type="string" optional>
  Natural language description of what data to extract.
</ParamField>

<ParamField path="schema" type="z.ZodSchema | BaseModel" optional>
  Type schema defining the structure of data to extract. Ensures type safety and validation.
</ParamField>

<ParamField path="selector" type="string" optional>
  XPath selector that limits extraction to a specific element, for example `xpath=/html/body/div/div`. Narrowing the scope can reduce token usage and improve accuracy.
</ParamField>

<ParamField path="iframes" type="boolean" optional>
  Set to `true` if the content to extract is inside iframes.

  **Default:** `false`
</ParamField>

<ParamField path="modelName" type="AvailableModel" optional>
  Overrides the default LLM for this extraction. In Python, pass `model_name`.
</ParamField>

<ParamField path="modelClientOptions" type="ClientOptions" optional>
  Model-specific client configuration options. In Python, pass `model_client_options`.
</ParamField>

<ParamField path="domSettleTimeoutMs" type="number" optional>
  Maximum time, in milliseconds, to wait for the DOM to stabilize. In Python, pass `dom_settle_timeout_ms`.

  **Default:** `30000`
</ParamField>
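
The parameters above can be combined in a single options object. The following is a minimal sketch, not part of the Stagehand API: `ScopedExtractOptions` and `buildScopedOptions` are hypothetical names, the interface copies only the non-schema fields of the `ExtractOptions` interface shown under Method Signatures, and the defaults are the ones documented above.

```typescript
// Hypothetical local type: the non-schema fields of ExtractOptions as documented above.
interface ScopedExtractOptions {
  instruction: string;
  selector?: string;
  iframes?: boolean;
  domSettleTimeoutMs?: number;
}

// Hypothetical helper: fills in the documented defaults and lets callers override them.
function buildScopedOptions(
  instruction: string,
  selector: string,
  overrides: Partial<ScopedExtractOptions> = {}
): ScopedExtractOptions {
  return {
    instruction,
    selector,
    iframes: false,            // documented default
    domSettleTimeoutMs: 30000, // documented default, in milliseconds
    ...overrides,              // caller-supplied values win over the defaults
  };
}

const options = buildScopedOptions(
  "extract product info from this section",
  "xpath=/html/body/div/div",
  { iframes: true }
);

// With a real Stagehand page, the result could be spread next to a schema:
// await page.extract({ ...options, schema: ProductSchema });
```

Keeping the defaults in one place makes it easier to see which calls deviate from them, for example pages that embed the target content in an iframe.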

### Response Types

<Tabs>
  <Tab title="With Schema">
    **Returns:** `Promise<ExtractResult<T>>` where T matches your schema

    The returned object will be strictly typed according to your schema definition.
  </Tab>

  <Tab title="String Only">
    **Returns:** `Promise<{ extraction: string }>`

    Simple string extraction without schema validation.
  </Tab>

  <Tab title="No Parameters">
    **Returns:** `Promise<{ pageText: string }>`

    Raw accessibility tree representation of page content.
  </Tab>
</Tabs>
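
The two schema-less forms return differently shaped objects, so code that accepts either one needs to tell them apart. A minimal sketch, assuming only the two result shapes documented above (`resultText` and the type aliases are hypothetical names, not Stagehand exports):

```typescript
// Result shapes of the two schema-less forms, as documented above.
type InstructionResult = { extraction: string };
type RawPageResult = { pageText: string };

// Hypothetical helper: returns the text payload regardless of which form produced it.
function resultText(result: InstructionResult | RawPageResult): string {
  // The `in` check narrows the union to the matching shape.
  return "extraction" in result ? result.extraction : result.pageText;
}

const fromInstruction = resultText({ extraction: "Page Title" });
const fromRawPage = resultText({ pageText: "Accessibility Tree: ..." });
```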

### Code Examples

<Tabs>
  <Tab title="Single Object">
    <CodeGroup>
      ```typescript TypeScript theme={null}
      import { z } from 'zod';

      // Schema definition
      const ProductSchema = z.object({
        name: z.string(),
        price: z.number(),
        inStock: z.boolean()
      });

      // Extraction
      const product = await page.extract({
        instruction: "extract product details",
        schema: ProductSchema
      });
      ```

      ```python Python theme={null}
      from pydantic import BaseModel

      # Schema definition
      class Product(BaseModel):
          name: str
          price: float
          in_stock: bool

      # Extraction
      product = await page.extract(
          instruction="extract product details",
          schema=Product
      )
      ```
    </CodeGroup>

    #### Example Response

    ```json theme={null}
    {
      "name": "Product Name",
      "price": 100,
      "inStock": true
    }
    ```
  </Tab>

  <Tab title="Arrays">
    <CodeGroup>
      ```typescript TypeScript theme={null}
      import { z } from 'zod';

      // Schema definition
      const ApartmentListingsSchema = z.object({
        apartments: z.array(z.object({
          address: z.string(),
          price: z.string(),
          bedrooms: z.number()
        }))
      });

      // Extraction
      const listings = await page.extract({
        instruction: "extract all apartment listings",
        schema: ApartmentListingsSchema
      });
      ```

      ```python Python theme={null}
      from pydantic import BaseModel
      from typing import List

      # Schema definition
      class Apartment(BaseModel):
          address: str
          price: str
          bedrooms: int

      class ApartmentListings(BaseModel):
          apartments: List[Apartment]

      # Extraction
      listings = await page.extract(
          instruction="extract all apartment listings",
          schema=ApartmentListings
      )
      ```
    </CodeGroup>

    #### Example Response

    ```json theme={null}
    {
      "apartments": [
        {
          "address": "123 Main St",
          "price": "$100,000",
          "bedrooms": 3
        },
        {
          "address": "456 Elm St",
          "price": "$150,000",
          "bedrooms": 2
        }
      ]
    }
    ```
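
    Because `price` is declared as a string in this schema, the response carries formatted values such as `"$100,000"`. If numeric prices are needed downstream, they can be converted after extraction. A minimal sketch, assuming `,` is a thousands separator and `.` a decimal point (`parsePrice` is a hypothetical helper, not part of Stagehand):

    ```typescript
    // Hypothetical helper: converts a price string such as "$100,000" to a number.
    // Returns null when the string contains no parseable number.
    function parsePrice(price: string): number | null {
      // Drop everything except digits and the decimal point.
      const cleaned = price.replace(/[^0-9.]/g, "");
      if (cleaned === "") return null;
      const value = Number(cleaned);
      return Number.isFinite(value) ? value : null;
    }

    // Sample data copied from the example response above.
    const sampleListings = {
      apartments: [
        { address: "123 Main St", price: "$100,000", bedrooms: 3 },
        { address: "456 Elm St", price: "$150,000", bedrooms: 2 },
      ],
    };

    const prices = sampleListings.apartments.map((a) => parsePrice(a.price));
    // prices is [100000, 150000]
    ```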
  </Tab>

  <Tab title="URLs">
    <CodeGroup>
      ```typescript TypeScript theme={null}
      import { z } from 'zod';

      // Schema definition
      const NavigationSchema = z.object({
        links: z.array(z.object({
          text: z.string(),
          url: z.string().url()  // URL validation
        }))
      });

      // Extraction
      const links = await page.extract({
        instruction: "extract navigation links",
        schema: NavigationSchema
      });
      ```

      ```python Python theme={null}
      from pydantic import BaseModel, HttpUrl
      from typing import List

      # Schema definition
      class NavLink(BaseModel):
          text: str
          url: HttpUrl  # URL validation

      class Navigation(BaseModel):
          links: List[NavLink]

      # Extraction
      links = await page.extract(
          instruction="extract navigation links",
          schema=Navigation
      )
      ```
    </CodeGroup>

    #### Example Response

    ```json theme={null}
    {
      "links": [
        {
          "text": "Home",
          "url": "https://example.com"
        }
      ]
    }
    ```
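
    Extracted links often mix internal and external destinations. A minimal sketch of filtering the response with the standard `URL` class, assuming the response shape shown above (`sameHostLinks` and the sample data are hypothetical and not part of Stagehand):

    ```typescript
    type NavLink = { text: string; url: string };

    // Hypothetical helper: keeps only links whose host matches the host of pageUrl.
    function sameHostLinks(links: NavLink[], pageUrl: string): NavLink[] {
      const host = new URL(pageUrl).host;
      return links.filter((link) => {
        try {
          return new URL(link.url).host === host;
        } catch {
          return false; // skip anything that does not parse as an absolute URL
        }
      });
    }

    const internal = sameHostLinks(
      [
        { text: "Home", url: "https://example.com" },
        { text: "Docs", url: "https://docs.example.org/start" },
      ],
      "https://example.com/products"
    );
    // internal contains only the "Home" link
    ```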
  </Tab>

  <Tab title="Scoped">
    <CodeGroup>
      ```typescript TypeScript theme={null}
      import { z } from 'zod';

      const ProductSchema = z.object({
        name: z.string(),
        price: z.number(),
        description: z.string()
      });

      // Extract from specific page section
      const data = await page.extract({
        instruction: "extract product info from this section",
        selector: "xpath=/html/body/div/div",
        schema: ProductSchema
      });
      ```

      ```python Python theme={null}
      from pydantic import BaseModel

      class Product(BaseModel):
          name: str
          price: float
          description: str

      # Extract from specific page section
      data = await page.extract(
          instruction="extract product info from this section",
          selector="xpath=/html/body/div/div",
          schema=Product
      )
      ```
    </CodeGroup>

    #### Example Response

    ```json theme={null}
    {
      "name": "Product Name",
      "price": 100,
      "description": "Product description"
    }
    ```
  </Tab>

  <Tab title="Schema-less">
    <CodeGroup>
      ```typescript TypeScript theme={null}
      // String only extraction
      const title = await page.extract("get the page title");
      // Returns: { extraction: "Page Title" }

      // Raw page content
      const content = await page.extract();
      // Returns: { pageText: "Accessibility Tree: ..." }
      ```

      ```python Python theme={null}
      # String only extraction
      title = await page.extract("get the page title")
      # Returns: {"extraction": "Page Title"}

      # Raw page content
      content = await page.extract()
      # Returns: {"pageText": "Accessibility Tree: ..."}
      ```
    </CodeGroup>

    #### Example Response

    ```json theme={null}
    {
      "extraction": "Page Title"
    }
    ```
  </Tab>

  <Tab title="Advanced">
    <CodeGroup>
      ```typescript TypeScript theme={null}
      import { z } from 'zod';

      // Schema with descriptions and validation
      const ProductSchema = z.object({
        price: z.number().describe("Product price in USD"),
        rating: z.number().min(0).max(5).describe("Customer rating out of 5"),
        available: z.boolean().describe("Whether product is in stock"),
        tags: z.array(z.string()).optional()
      });

      // Nested schema
      const EcommerceSchema = z.object({
        product: z.object({
          name: z.string(),
          price: z.object({
            current: z.number(),
            original: z.number().optional()
          })
        }),
        reviews: z.array(z.object({
          rating: z.number(),
          comment: z.string()
        }))
      });
      ```

      ```python Python theme={null}
      from pydantic import BaseModel, Field
      from typing import Optional, List

      # Schema with descriptions and validation
      class Product(BaseModel):
          price: float = Field(description="Product price in USD")
          rating: float = Field(ge=0, le=5, description="Customer rating out of 5")
          available: bool = Field(description="Whether product is in stock")
          tags: Optional[List[str]] = None

      # Nested schema
      class Price(BaseModel):
          current: float
          original: Optional[float] = None

      class Review(BaseModel):
          rating: int
          comment: str

      class ProductDetails(BaseModel):
          name: str
          price: Price

      class EcommerceData(BaseModel):
          product: ProductDetails
          reviews: List[Review]
      ```
    </CodeGroup>

    #### Example Response

    ```json theme={null}
    {
      "product": {
        "name": "Product Name",
        "price": {
          "current": 100,
          "original": 120
        }
      },
      "reviews": [
        {
          "rating": 4,
          "comment": "Great product!"
        }
      ]
    }
    ```
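
    Fields declared with `.optional()` may be absent from the response, so consuming code should handle the missing case. A minimal sketch that works on the nested `price` object from the response above (`discountPercent` and the `Price` type are hypothetical, written in plain TypeScript rather than derived from the Zod schema):

    ```typescript
    // Plain TypeScript mirror of the nested price object in the response above.
    type Price = { current: number; original?: number };

    // Hypothetical helper: rounded percentage discount,
    // or null when no higher original price was extracted.
    function discountPercent(price: Price): number | null {
      if (price.original === undefined || price.original <= price.current) {
        return null;
      }
      return Math.round(((price.original - price.current) / price.original) * 100);
    }

    const discount = discountPercent({ current: 100, original: 120 }); // 17
    const noDiscount = discountPercent({ current: 100 });              // null
    ```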
  </Tab>
</Tabs>
