Recently, I integrated TianliGPT into this blog to build an automatic article summary (TL;DR) module. TianliGPT offers a very simple, quick way to embed it, but it is limited to summary generation and related-article recommendations; extending its capabilities beyond that is difficult. So I recently dropped TianliGPT in favor of Moonshot AI for article summary generation and further feature expansion.
Determine Requirements#
First, we want the module not only to generate a summary for each article, but also to raise relevant questions about the article's content together with corresponding answers. Clicking a question reveals its answer. The effect is similar to the image below:
Based on the above requirements, we need the model to return content in a JSON format similar to the one below, which will then be processed by the frontend:
{
    "summary": "Article summary content",
    "qa": [
        {
            "question": "Question 1",
            "answer": "Answer 1"
        },
        {
            "question": "Question 2",
            "answer": "Answer 2"
        },
        ...
    ]
}
We can then design the following prompt to give to the model:
Design a concise list of questions aimed at extracting professional concepts or inadequately detailed aspects from the article. Provide an article summary and formulate 6 questions regarding specific concepts. The questions should be precise, and generate corresponding answers. Please output your response in the following JSON format:
{
    "summary": "Article summary content",
    "qa": [
        {
            "question": "Question 1",
            "answer": "Answer 1"
        },
        ...
    ]
}
Note: Please place the article summary content in the summary field, the questions in the question field, and the answers in the answer field.
Note
A prompt is the information or instructions given to the model to guide it toward a specific response or output. It determines how the model understands and processes user input. An effective prompt is typically clear, relevant, concise, contextualized, and directive.
Since Kimi is built on the same models that Moonshot AI exposes through its API, we can test with Kimi to get a rough preview of the results the Moonshot API would return (really, to save money). To verify that this prompt achieves the desired effect, I tried it in a conversation with Kimi; the results are shown in the image below:
Initiating a Conversation with the Model#
After resolving what to communicate to the model, we need to address how to communicate with it. The official documentation of Moonshot AI provides implementations in both Python and Node.js, but here we will use PHP to achieve the same functionality.
The official API provided for us is Chat Completions: https://api.moonshot.cn/v1/chat/completions, and examples of the request headers and request content are as follows:
# Request headers
{
    "Content-Type": "application/json",
    "Authorization": "Bearer $apiKey"
}

# Request content
{
    "model": "moonshot-v1-8k",
    "messages": [
        {
            "role": "system",
            "content": "You are Kimi, an AI assistant provided by Moonshot AI, and you are better at conversations in Chinese and English. You will provide users with safe, helpful, and accurate answers. At the same time, you will refuse to answer any questions involving terrorism, racial discrimination, pornography, violence, etc. Moonshot AI is a proper noun and should not be translated into other languages."
        },
        { "role": "user", "content": "Hello, my name is Li Lei, what is 1+1?" }
    ],
    "temperature": 0.3
}
- `model` is the model name. Moonshot-v1 currently offers three models: `moonshot-v1-8k`, `moonshot-v1-32k`, and `moonshot-v1-128k`.
- The `messages` array is the list of conversation messages, where `role` can be one of `system`, `user`, or `assistant`: `system` messages provide context or guidance for the conversation and are usually filled with the prompt; `user` messages carry the user's question or input; `assistant` messages are the model's replies.
- `temperature` is the sampling temperature, recommended to be `0.3`.
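For reference, the same request content can be built as a plain PHP array and serialized with `json_encode` (a minimal sketch; the message values are placeholders):

# A minimal sketch: the request content above expressed as a PHP array
$payload = [
    'model' => 'moonshot-v1-8k',
    'messages' => [
        ['role' => 'system', 'content' => 'You are Kimi, an AI assistant provided by Moonshot AI...'],
        ['role' => 'user', 'content' => 'Hello, my name is Li Lei, what is 1+1?']
    ],
    'temperature' => 0.3
];
$body = json_encode($payload, JSON_UNESCAPED_UNICODE); # becomes the POST body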
We can construct a `MoonshotAPI` class to implement this functionality:
class MoonshotAPI {
    private $apiKey;
    private $baseUrl;

    public function __construct($apiKey) {
        $this->apiKey = $apiKey;
        $this->baseUrl = "https://api.moonshot.cn/v1/chat/completions";
    }

    /**
     * Send a request and get the API response data
     * @param string $model Model name
     * @param array $messages Message array
     * @param float $temperature Temperature parameter that affects the creativity of the response
     * @return mixed API response data
     */
    public function sendRequest($model, $messages, $temperature) {
        $payload = $this->preparePayload($model, $messages, $temperature);
        $response = $this->executeCurlRequest($payload);

        $responseData = json_decode($response, true);
        if (json_last_error() !== JSON_ERROR_NONE) {
            throw new RuntimeException("Invalid response format.");
        }

        return $responseData;
    }

    /**
     * Construct the request content
     * @param string $model Model
     * @param array $messages List of conversation messages
     * @param float $temperature Sampling temperature
     * @return array Constructed request content
     */
    private function preparePayload($model, $messages, $temperature) {
        return [
            'model' => $model,
            'messages' => $messages,
            'temperature' => $temperature,
            'response_format' => ["type" => "json_object"] # Enable JSON Mode
        ];
    }

    /**
     * Send the request
     * @param array $data Request data
     * @return string API response data
     */
    private function executeCurlRequest($data) {
        $curl = curl_init();
        curl_setopt_array($curl, [
            CURLOPT_URL => $this->baseUrl,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_POST => true,
            CURLOPT_POSTFIELDS => json_encode($data),
            CURLOPT_TIMEOUT => 60,
            CURLOPT_HTTPHEADER => [
                'Content-Type: application/json',
                'Authorization: Bearer ' . $this->apiKey
            ],
        ]);

        $response = curl_exec($curl);
        if ($response === false) {
            $error = curl_error($curl);
            curl_close($curl);
            throw new RuntimeException($error);
        }

        curl_close($curl);
        return $response;
    }
}
Note
If you simply tell the Kimi large model in the prompt, "Please output content in JSON format", it can understand the request and will generate a JSON document as asked, but the output usually has flaws, such as extra explanatory text outside the JSON document itself.
Therefore, we need to enable JSON Mode when constructing the request content, so that the model will "output a valid, correctly parsable JSON document as requested". That is what adding `'response_format' => ["type" => "json_object"]` to the array returned by the `preparePayload` method does.
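As a purely hypothetical illustration, without JSON Mode a reply might come wrapped in prose like the sample below, which is exactly the kind of output that makes a naive `json_decode` fail:

Sure! Here is the summary you requested in JSON format:
{
    "summary": "...",
    "qa": [...]
}
Hope this helps!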
Of the three parameters taken by the `MoonshotAPI` class's `sendRequest` method, only the `$messages` conversation message list is relatively complex, so we create a `getMessages` function to build it.
/**
 * Build the array of messages
 *
 * @param string $articleText Article content
 * @return array Array containing system and user messages
 */
function getMessages($articleText) {
    return [
        [
            "role" => "system",
            "content" => <<<EOT
Design a concise list of questions aimed at extracting professional concepts or inadequately detailed aspects from the article. Provide an article summary and formulate 6 questions regarding specific concepts. The questions should be precise, and generate corresponding answers. Please output your response in the following JSON format:
{
    "summary": "Article summary content",
    "qa": [
        {
            "question": "Question 1",
            "answer": "Answer 1"
        },
        ...
    ]
}
Note: Please place the article summary content in the summary field, the questions in the question field, and the answers in the answer field.
EOT
        ],
        [
            "role" => "user",
            "content" => $articleText
        ]
    ];
}
Here, we fill in the initially designed prompt as the `system` message and the article content as the first `user` message. In an actual API request, the `messages` array is arranged in chronological order, usually starting with the `system` message, followed by the user's question, and finally the assistant's reply. This structure helps maintain the context and coherence of the conversation.
When we communicate with the model, it returns JSON data like the following, in which we mainly care about the `choices` array.
{
    "id": "chatcmpl-xxxxxx",
    "object": "chat.completion",
    "created": xxxxxxxx,
    "model": "moonshot-v1-8k",
    "choices": [{
        "index": 0,
        "message": {
            "role": "assistant",
            "content": "Here is the model's reply"
        },
        "finish_reason": "stop"
    }],
    "usage": {
        "prompt_tokens": 229,
        "completion_tokens": 64,
        "total_tokens": 293
    }
}
In our conversation mode, the `choices` array usually contains only one object (which is why we write `$result['choices'][0]` when retrieving the model's reply and other information), and this object holds the text reply generated by the model. Its `finish_reason` field indicates why generation stopped; if the model believes it has produced a complete answer, `finish_reason` will be `stop`, so we can use it to check whether the generated content is complete. The `content` field inside the object is the reply the model gives us.
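In isolation, those checks boil down to a few lines (a sketch assuming `$result` already holds the decoded API response; the full version appears in the `Moonshot` function below):

# Pull the model's reply out of the decoded response
$choice = $result['choices'][0];
if (($choice['finish_reason'] ?? null) !== 'stop') {
    throw new RuntimeException('The reply is missing or truncated.');
}
$reply = $choice['message']['content']; # the JSON document generated by the model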
Next, we create the `Moonshot` function to call the `MoonshotAPI` class:
/**
 * Call the MoonshotAPI class
 *
 * @param string $articleText Article content
 * @param string $model Model to use, default is "moonshot-v1-8k"
 * @return array Returns an array with a status code and data
 */
function Moonshot($articleText, $model = "moonshot-v1-8k") {
    $apiKey = 'sk-xxxxxxxx'; # $apiKey is the API Key applied for in the user center
    $moonshotApi = new MoonshotAPI($apiKey);
    $messages = getMessages($articleText);
    $temperature = 0.3;

    try {
        $result = $moonshotApi->sendRequest($model, $messages, $temperature);
        if (isset($result['error'])) {
            throw new RuntimeException("Model returned an error: " . $result['error']['message']);
        }

        # Check whether the content generated by the model is complete
        $responseContent = $result['choices'][0]['message']['content'] ?? null;
        if ($responseContent === null || $result['choices'][0]['finish_reason'] !== "stop") {
            throw new RuntimeException("The returned content does not exist or is truncated.");
        }

        # Since JSON Mode is enabled, the model's reply should be standard JSON,
        # so we filter out replies that are not valid JSON
        $decodedResponse = json_decode($responseContent, true);
        if (json_last_error() !== JSON_ERROR_NONE) {
            throw new RuntimeException("Invalid response format.");
        }

        return $result;
    } catch (Exception $e) {
        return ['stat' => 400, 'message' => $e->getMessage()];
    }
}
At this point, we have arrived at the code below. If all goes well, calling it directly will get the model's reply ✌️. The frontend can then render the reply onto the page, which I won't elaborate on here.
header('Content-Type: application/json');

class MoonshotAPI {...}
function getMessages(...) {...}
function Moonshot(...) {...}

# Usage example
try {
    $article = "This is the article content";
    $aiResponse = Moonshot($article);
    # Directly output or further process the model's returned result
    echo json_encode($aiResponse, JSON_UNESCAPED_UNICODE);
} catch (Exception $e) {
    echo json_encode(['stat' => 400, 'message' => $e->getMessage()], JSON_UNESCAPED_UNICODE);
}
Appendix: Handling Long Articles#
For certain articles, directly calling the above code may result in the following error:
{
    "error": {
        "type": "invalid_request_error",
        "message": "Invalid request: Your request exceeded model token limit: 8192"
    }
}
The reason for this error is that the total number of input and output context tokens exceeds the token limit of the model (here assumed to be `moonshot-v1-8k`). We need to choose an appropriate model based on the context length plus the expected output length in tokens. For this situation, the documentation also provides example code for choosing the appropriate model based on context length; we just need to port it to PHP and integrate it with our previous code.
/**
 * Estimate the number of tokens for the given messages.
 * Uses the estimate-token-count API provided in the documentation to estimate the token count of the input messages.
 *
 * @param string $apiKey API key
 * @param array $inputMessages Message array to estimate the token count for
 * @return int Returns the estimated total token count
 */
function estimateTokenCount($apiKey, $inputMessages) {
    $data = [
        'model' => 'moonshot-v1-128k',
        'messages' => $inputMessages,
    ];

    $curl = curl_init();
    curl_setopt_array($curl, [
        CURLOPT_URL => 'https://api.moonshot.cn/v1/tokenizers/estimate-token-count',
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST => true,
        CURLOPT_POSTFIELDS => json_encode($data),
        CURLOPT_TIMEOUT => 60,
        CURLOPT_HTTPHEADER => [
            'Content-Type: application/json',
            'Authorization: Bearer ' . $apiKey
        ],
    ]);

    $response = curl_exec($curl);
    if ($response === false) {
        $error = curl_error($curl);
        curl_close($curl);
        throw new RuntimeException($error);
    }
    curl_close($curl);

    $result = json_decode($response, true);
    return $result['data']['total_tokens'];
}
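As a quick sanity check, we can feed it the same messages we send to the chat endpoint (a sketch reusing the `getMessages` helper from earlier):

# Estimate how many tokens the prompt plus article will consume
$promptTokens = estimateTokenCount($apiKey, getMessages($articleText));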
/**
 * Choose the most suitable model based on the estimated token count.
 *
 * @param string $apiKey API key
 * @param array $inputMessages Message array
 * @param int $defaultMaxTokens Default maximum token limit
 * @return string Returns the selected model name
 */
function selectModel($apiKey, $inputMessages, $defaultMaxTokens = 1024) {
    $promptTokenCount = estimateTokenCount($apiKey, $inputMessages);
    $totalAllowedTokens = $promptTokenCount + $defaultMaxTokens;

    if ($totalAllowedTokens <= 8 * 1024) {
        return "moonshot-v1-8k";
    } elseif ($totalAllowedTokens <= 32 * 1024) {
        return "moonshot-v1-32k";
    } elseif ($totalAllowedTokens <= 128 * 1024) {
        return "moonshot-v1-128k";
    } else {
        throw new Exception("Tokens exceed the maximum limit.");
    }
}
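For example, `selectModel` returns the smallest model whose context window fits the estimated prompt plus the reserved output budget (a sketch reusing the helpers above):

# An article whose prompt is estimated at ~20k tokens would get moonshot-v1-32k
$model = selectModel($apiKey, getMessages($articleText));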
In the `Moonshot` function, when the model returns an error of type `invalid_request_error` (i.e., the maximum token limit for that model was exceeded), we call the `selectModel` function to choose the most suitable model and then retry the conversation with it.
function Moonshot($articleText, $model = "moonshot-v1-8k") {
    ...
    if (isset($result['error'])) {
        if ($result['error']['type'] === "invalid_request_error") {
            $model = selectModel($apiKey, $messages);
            return Moonshot($articleText, $model);
        } else {
            throw new RuntimeException("Model returned an error: " . $result['error']['message']);
        }
    }
    ...
}
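One caveat I should flag with this retry (my own observation, not from the documentation): if the request fails with `invalid_request_error` for a reason other than the token limit, `selectModel` may pick the same model again and the recursion would never terminate. A minimal guard is to give up when the selected model does not change:

if ($result['error']['type'] === "invalid_request_error") {
    $newModel = selectModel($apiKey, $messages);
    if ($newModel === $model) {
        # Retrying with the same model cannot help, so surface the error
        throw new RuntimeException("Model returned an error: " . $result['error']['message']);
    }
    return Moonshot($articleText, $newModel);
}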
This article is synchronized and updated to xLog by Mix Space. The original link is https://www.vinking.top/posts/codes/developing-auto-summary-module-using-ai