This actor runs an Instagram post scraper and exposes two kinds of outputs: a structured dataset of scraped items and a key-value store entry named OUTPUT. Each invocation creates a Run (with an ID and status) and then writes results into the actor's dataset and key-value store according to the actor's internal logic. The three operations available map to three practical workflows: start-and-wait-for-key-value, start-and-wait-for-dataset, and start-and-return-run-metadata.
How the domain is organized
- Actor run: a single execution of the scraper. A run has metadata (ID, status, timestamps). The runs-sync-apify-instagram-post-scraper call returns that metadata immediately.
- Dataset: the actor writes structured, itemized results (one scraped post per item) into a dataset. The run-sync-get-dataset-items-apify-instagram-post-scraper call executes the actor and returns those dataset items in the response.
- Key-value store (key OUTPUT): the actor also writes to the key-value store; the key OUTPUT holds a blob the actor considers the primary output (often combined or raw data). The run-sync-apify-instagram-post-scraper call executes the actor and returns the OUTPUT value.
These three resources relate as: input -> run -> {dataset items, key-value OUTPUT}.
Entry points — which operation to call first
Choose based on what the user expects back and whether you want the call to wait for completion:
- If the user wants the run ID and run metadata immediately (start now, check later), call runs-sync-apify-instagram-post-scraper. It returns a RunsResponseSchema object containing the run identifier and status.
- If the user wants the actor to complete and needs the structured scraped items (ready-to-use JSON posts), call run-sync-get-dataset-items-apify-instagram-post-scraper. The call waits for completion and returns the dataset items directly in the response body.
- If the user wants whatever the actor places into the key-value OUTPUT (often an aggregated or raw output) and expects the call to wait until that value is available, call run-sync-apify-instagram-post-scraper. It waits for completion and returns the OUTPUT value.
All calls require a valid token and a body that conforms to the actor's InputSchema (set targets, limits, and other scraping options supported by the actor).
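As a minimal sketch of how the three entry points differ only in the path they target, the helper below builds (but does not send) a request for each operation. The base URL, the actor ID form, and the mapping from operation names to path suffixes are assumptions based on Apify's public REST API layout, not something this document confirms; `build_request` is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: Apify's public REST base URL and the actor ID in "owner~name" form.
API_BASE = "https://api.apify.com/v2"
ACTOR_ID = "apify~instagram-post-scraper"

# Assumed mapping from the operation names used in this document to path suffixes.
OPERATION_PATHS = {
    "runs-sync-apify-instagram-post-scraper": "runs",
    "run-sync-get-dataset-items-apify-instagram-post-scraper": "run-sync-get-dataset-items",
    "run-sync-apify-instagram-post-scraper": "run-sync",
}


def build_request(operation: str, token: str, body: dict) -> Request:
    """Build, without sending, a POST request for one of the three operations."""
    if not token:
        raise ValueError("a valid token is required for every operation")
    path = OPERATION_PATHS[operation]  # raises KeyError for unknown operation names
    url = f"{API_BASE}/acts/{ACTOR_ID}/{path}?{urlencode({'token': token})}"
    data = json.dumps(body).encode("utf-8")  # body must conform to the actor's InputSchema
    return Request(url, data=data, method="POST",
                   headers={"Content-Type": "application/json"})
```

Passing the returned object to `urllib.request.urlopen` would perform the call. For the two waiting operations, that call blocks until the actor finishes, so a generous timeout is advisable.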
Common user tasks and the exact workflow to run
- "Scrape N recent posts for this username and return the posts as JSON"
  - Action: call run-sync-get-dataset-items-apify-instagram-post-scraper with body fields that select the username and set the limit (e.g., max items or since/until). The response body will contain dataset items representing posts.
- "Run the scraper and return raw/combined output (HTML or bundle)"
  - Action: call run-sync-apify-instagram-post-scraper. The response body contains the actor's OUTPUT key-value content; use this when the actor bundles or post-processes results before writing them out.
- "Start a scraping run now and give me the run ID so I can track it later"
  - Action: call runs-sync-apify-instagram-post-scraper. Use the returned run metadata (run ID, status, timestamps) for reporting. If the user later asks for the scraped data itself, use one of the two synchronous run operations; otherwise, the run ID simply documents which execution produced the results.
- "Compare two runs or report run status"
  - Action: start both runs with runs-sync-apify-instagram-post-scraper (or reuse the run IDs saved from earlier calls). The run metadata contains the status and timing needed for comparison. To compare the scraped results themselves, use the dataset items or OUTPUT value returned by the synchronous run calls.
For every task that requires specific input options (profile vs hashtag vs URL, maximum items, date range, include/exclude comments), read the actor's InputSchema fields and set those fields in body accordingly. The actor enforces what selectors and limits exist; set only supported fields.
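One way to enforce "set only supported fields" is to check the body against the schema's declared properties before calling. The sketch below assumes the InputSchema is available as a JSON-Schema-like dict with `properties` and `required` keys. The schema excerpt and the field names `username` and `resultsLimit` are hypothetical placeholders; the actor's real InputSchema defines the actual names.

```python
# Hypothetical InputSchema excerpt; read the actor's real schema for actual field names.
EXAMPLE_INPUT_SCHEMA = {
    "properties": {
        "username": {"type": "array"},
        "resultsLimit": {"type": "integer"},
    },
    "required": ["username"],
}


def build_body(schema: dict, **options) -> dict:
    """Return a request body containing only fields the schema declares."""
    supported = set(schema.get("properties", {}))
    unsupported = sorted(set(options) - supported)
    if unsupported:
        raise ValueError(f"fields not in InputSchema: {unsupported}")
    missing = sorted(set(schema.get("required", [])) - set(options))
    if missing:
        raise ValueError(f"required fields missing: {missing}")
    return dict(options)


# Example: "scrape 10 recent posts for this username" (placeholder field names).
body = build_body(EXAMPLE_INPUT_SCHEMA, username=["example_user"], resultsLimit=10)
```

Rejecting unknown fields locally gives a clearer error than a failed run, at the cost of keeping a copy of the schema in sync with the actor.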
Non-obvious patterns and gotchas
- Naming confusion: the API exposes three similarly named operations. Remember the difference by return type and whether they block for outputs:
  - runs-sync-* = returns run metadata immediately (use when you only need the run ID/status).
  - run-sync-get-dataset-items-* = waits and returns the dataset items (structured posts).
  - run-sync-* = waits and returns the key-value OUTPUT (actor-defined primary output).
- OUTPUT vs dataset: the actor may put different content in the dataset and in OUTPUT. Don't assume they are identical; choose the operation that returns the form you actually need.
- Long runs and waiting: the two run-sync-* operations wait for the actor to finish. If the actor will scrape large volumes or run long, waiting may delay the response. If the user only needs confirmation that the run started, use runs-sync-apify-instagram-post-scraper and return the run ID.
- Inspect the schemas before calling: the actor's InputSchema defines exactly how to specify targets (username, hashtag, start URLs) and limits. The RunsResponseSchema shows which run fields (ID, status, timestamps) will be returned. Read those schemas to know which body fields to set and what to expect in the response bodies.
- Output formats vary by actor configuration: dataset item structure (which fields are present for each post) and the shape of OUTPUT depend on the actor's internal mapping. Do not assume field names in the returned items; examine the actual response and map fields to user-facing output.
- Token is required: every operation accepts a token argument; provide a valid token in all calls.
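The advice not to assume field names can be sketched as a tolerant mapping step that tries several candidate keys per user-facing field and falls back to None. Every key name below is a hypothetical placeholder; inspect a real response from the actor before settling on candidates.

```python
from typing import Any, Iterable, Optional


def first_present(item: dict, candidates: Iterable[str]) -> Optional[Any]:
    """Return the first non-None value found under the candidate keys, else None."""
    for key in candidates:
        if item.get(key) is not None:
            return item[key]
    return None


# Hypothetical candidate keys per user-facing field; not taken from the actor's schema.
FIELD_CANDIDATES = {
    "link": ("url", "postUrl"),
    "text": ("caption", "text"),
    "posted_at": ("timestamp", "takenAt"),
}


def normalize_post(item: dict) -> dict:
    """Map one dataset item to a stable, user-facing shape."""
    return {name: first_present(item, keys) for name, keys in FIELD_CANDIDATES.items()}
```

A missing field then shows up as an explicit None in the normalized record instead of raising a KeyError partway through a result set.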
Quick decision guide
- Need structured posts right now -> run-sync-get-dataset-items-apify-instagram-post-scraper.
- Need raw or aggregated actor output (single blob) -> run-sync-apify-instagram-post-scraper.
- Only need run ID/status (don't wait for completion) -> runs-sync-apify-instagram-post-scraper.
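The guide above can be encoded as a small lookup, which may help when the choice is made programmatically. The `Need` enum and `choose_operation` are illustrative names introduced here; only the operation names and their waiting behavior come from this document.

```python
from enum import Enum
from typing import Tuple


class Need(Enum):
    STRUCTURED_POSTS = "structured posts"
    PRIMARY_OUTPUT = "key-value OUTPUT"
    RUN_METADATA = "run ID and status"


# (operation name, waits for completion) for each need, following the guide above.
_GUIDE = {
    Need.STRUCTURED_POSTS: ("run-sync-get-dataset-items-apify-instagram-post-scraper", True),
    Need.PRIMARY_OUTPUT: ("run-sync-apify-instagram-post-scraper", True),
    Need.RUN_METADATA: ("runs-sync-apify-instagram-post-scraper", False),
}


def choose_operation(need: Need) -> Tuple[str, bool]:
    """Return the operation to call and whether that call waits for completion."""
    return _GUIDE[need]
```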
Final notes
When preparing body, set the selection criteria (username, hashtag, URL), result limits, and any filtering options the InputSchema exposes. Use the run metadata returned by runs-sync-apify-instagram-post-scraper when you need to reference or document which execution produced results. Choose the synchronous dataset or key-value calls when the user expects immediate data in the response.