User Manual

This public manual documents how LLM tester with llama.cpp works with the current Android implementation.

Related docs: Technical Specification | Privacy Policy

Important: Download Data Usage

Downloading models may require gigabytes of data. Using mobile/cellular data may incur significant charges; downloading over Wi-Fi is strongly recommended.

1. Overview

2. Recommended Setup

  1. If the API/WebUI enablement popup appears at launch, enable it when needed or check "Don't show next time" to skip it on future launches.
  2. On first launch, if you check "Don't show next time" in Quick Start, it will not be shown on subsequent launches.
  3. Open "Settings" from the main screen.
    * During inference (Busy), the Settings button is disabled and is re-enabled automatically when processing completes.
  4. Enter the model URL or import a .gguf file from the local device, then tap "Load Model".
    * The local import picker opens in Downloads by default, and you can navigate elsewhere on the device as needed. Reachable HTTP/HTTPS URLs can be used. HTTPS uses normal SSL/TLS certificate verification. Imported local files are saved as filenames only in Settings.
  5. Edit parameters if needed and tap "Save Config".
  6. Tap "SAVE & CLOSE" to save settings and apply them to the model immediately.

3. Main Screen Features

4. Settings Screen

5. Model Parameter Details

Basic Parameters

Penalty Parameters

Mirostat Parameters

Additional Sampling Parameters

DRY Parameters

Output Settings

Think Settings

6. Prompt Template Auto-Selection

When no custom template is set, the app first estimates the family from GGUF chat_template metadata and otherwise auto-selects from the filename.

7. Stop Sequences

Generation automatically stops when common chat template delimiters are detected in the output.

8. API/WebUI Server (Optional)

9. 🧭 Finding GGUF Files

9-1. Locating GGUF-compatible models

9-2. Choosing a quantization variant (overview)

10. 📥 Downloading from a Browser

10-1. Manual download

  1. Open the model page
    Example: https://huggingface.co/unsloth/Mistral-Small-3.1-24B-Instruct-2503-GGUF
  2. Click the Files tab
  3. Click the desired *.gguf file
  4. Press the Download button in the top right

10-2. Getting a direct URL to a GGUF file

  1. In the Files tab, click the *.gguf file to open its page
  2. Right-click the Download button and select "Copy link"
  3. You now have a direct URL to the GGUF file that you can paste into the app

Tips

Loading a very large model may stop because address-space reservation fails or because the process was interrupted by user action. In that case the app clears temporary load files on the next launch and shows a notice. If needed, try a smaller model or load the model again from Settings.