# Zero to Hands-On with IQuest-Coder: A Practical Tutorial for the 40B Code LLM

Have you ever wished for an "AI pair programmer" that writes code, hunts bugs, and optimizes algorithms for you? It's here. IQuest-Coder-V1-40B-Instruct is a new-generation code large language model (LLM) built for software engineering and competitive programming, with outstanding results on several authoritative coding benchmarks. This article walks you through deploying and calling the model locally, step by step, from scratch — even newcomers to AI and deep learning can follow along.

We will use vLLM as the inference engine to run this 40B-parameter model efficiently across multiple GPUs, and we will fix the key error you are likely to hit during deployment.

## 1. Learning Goals and Prerequisites

✅ What you will learn:

- How to set up a local inference environment for large code models
- How to deploy a HuggingFace-format LLM with vLLM
- How to solve the "custom architecture not supported by vLLM" problem with a hands-on patch
- How to download and run the IQuest-Coder-V1-40B-Instruct instruction model
- How to call your local AI coding assistant through an API

Prerequisites:

| Item | Recommended configuration |
| --- | --- |
| Operating system | Ubuntu 20.04 |
| GPU | At least 4x NVIDIA L20/A100 (VRAM >= 48 GB each) |
| Total VRAM | >= 192 GB (for serving the 40B model) |
| CUDA version | 12.1 |
| Python | 3.10–3.12 |
| Disk space | >= 200 GB (model files are about 150 GB) |

Tip: if you don't have a high-performance local server, consider renting an instance from a cloud platform (e.g. Alibaba Cloud or AutoDL) for this experiment.

## 2. Environment Setup: Building the vLLM Inference Environment

First, create a dedicated virtual environment for all dependencies, so they don't conflict with other projects.

### 2.1 Create a Python virtual environment

```bash
# Create a virtual environment named vllm_env
python3 -m venv vllm_env

# Activate it
source vllm_env/bin/activate

# Upgrade pip
pip install --upgrade pip
```

### 2.2 Install the core dependencies

```bash
# Install the latest vLLM (0.13.0 or later recommended)
pip install vllm

# Install the DLPack extension (required by some CUDA operations)
pip install torch-c-dlpack-ext

# Install the ModelScope client (used to download the model)
pip install modelscope
```

✅ Your base inference environment is now ready.

## 3. Model Download: Fetching IQuest-Coder-V1-40B-Instruct

The model is hosted on the ModelScope community hub; we download it with the ModelScope command-line tool.

### 3.1 Run the download command

```bash
modelscope download \
    --model IQuestLab/IQuest-Coder-V1-40B-Loop-Instruct \
    --local_dir ./IQuest-Coder-V1-40B-Loop-Instruct
```

Notes:

- `--model`: the model ID
- `--local_dir`: the local save path

⚠️ Note: the model is huge (about 150 GB at FP16) and the download takes a long time — make sure your network is stable and your disk has enough room.

## 4. The Key Fix: Patching vLLM to Support the IQuest Architecture

Running the model directly fails with:

```
Model architectures [IQuestLoopCoderForCausalLM] are not supported
```

This happens because vLLM does not yet natively support IQuest-Coder's custom model structure. We need to add support by hand.

### 4.1 Edit the model registry

Open the model registry file under your vLLM installation (adjust `python3.12` in the path to match your Python version):

```bash
vim vllm_env/lib/python3.12/site-packages/vllm/model_executor/models/registry.py
```

After the line `"Zamba2ForCausalLM": ("zamba2", "Zamba2ForCausalLM"),` add two new entries:

```python
"IQuestLoopCoderForCausalLM": ("iquest_loopcoder", "IQuestLoopCoderForCausalLM"),
"IQuestCoderForCausalLM": ("llama", "LlamaForCausalLM"),
```

This tells vLLM to load the module named `iquest_loopcoder.py` whenever it encounters the `IQuestLoopCoderForCausalLM` architecture.

### 4.2 Create the custom model implementation file

Create a new file:

```bash
touch vllm_env/lib/python3.12/site-packages/vllm/model_executor/models/iquest_loopcoder.py
```

Paste the complete code below into it (the implementation provided by the official PR):

```python
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Inference-only LoopCoder model compatible with HuggingFace weights."""
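# Reader's note (added for this tutorial, not part of the upstream PR):
# at a high level this file implements a "looped" decoder.
# IQuestLoopCoderModel.forward runs the same decoder stack loop_num times
# (default 2). Loop 0 performs ordinary causal attention and fills a
# full-context KV cache; later loops blend that cached "global" context
# with a sliding-window "local" attention, mixed per head by a learned
# sigmoid gate (see LoopGateProjection below).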
from __future__ import annotations

from collections.abc import Iterable
from dataclasses import replace
from typing import Any

import torch
from torch import nn
from transformers import PretrainedConfig

from vllm.attention.backends.abstract import AttentionType
from vllm.attention.layer import Attention
from vllm.compilation.decorators import support_torch_compile
from vllm.config import CacheConfig, VllmConfig
from vllm.distributed import get_tensor_model_parallel_world_size
from vllm.model_executor.layers.activation import SiluAndMul
from vllm.model_executor.layers.layernorm import LayerNorm
from vllm.model_executor.layers.linear import (
    ColumnParallelLinear,
    MergedColumnParallelLinear,
    QKVParallelLinear,
    RowParallelLinear,
)
from vllm.model_executor.layers.logits_processor import LogitsProcessor
from vllm.model_executor.layers.quantization import QuantizationConfig
from vllm.model_executor.layers.rotary_embedding import get_rope
from vllm.model_executor.layers.vocab_parallel_embedding import (
    ParallelLMHead,
    VocabParallelEmbedding,
)
from vllm.model_executor.model_loader.weight_utils import (
    default_weight_loader,
    maybe_remap_kv_scale_name,
)
from vllm.sequence import IntermediateTensors

from .utils import (
    AutoWeightsLoader,
    extract_layer_index,
    make_empty_intermediate_tensors_factory,
    make_layers,
    maybe_prefix,
)


class LoopCoderRMSNorm(nn.Module):
    """LoopCoderRMSNorm is equivalent to T5LayerNorm."""

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states: torch.Tensor):
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.to(torch.float32)
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
        return self.weight * hidden_states.to(input_dtype)


class LoopCoderMLP(nn.Module):
    def __init__(
        self,
        hidden_size: int,
        intermediate_size: int,
        hidden_act: str,
        quant_config: QuantizationConfig | None = None,
        prefix: str = "",
    ) -> None:
        super().__init__()
        self.gate_up_proj = MergedColumnParallelLinear(
            hidden_size,
            [intermediate_size] * 2,
            bias=False,
            quant_config=quant_config,
            prefix=f"{prefix}.gate_up_proj",
        )
        self.down_proj = RowParallelLinear(
            intermediate_size,
            hidden_size,
            bias=False,
            quant_config=quant_config,
            prefix=f"{prefix}.down_proj",
        )
        if hidden_act != "silu":
            raise ValueError(
                f"Unsupported activation: {hidden_act}. "
                "Only silu is supported for now."
            )
        self.act_fn = SiluAndMul()

    def forward(self, x):
        gate_up, _ = self.gate_up_proj(x)
        x = self.act_fn(gate_up)
        x, _ = self.down_proj(x)
        return x

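
# Reader's note (added for this tutorial): LoopCoderAttention below builds
# one vLLM Attention object per loop. attn[0] (loop 0) uses the regular
# cache_config, i.e. a full-context KV cache that later loops read back as
# "global" context; attn[i] for i >= 1 gets a copy of cache_config with
# sliding_window=loop_window_size, i.e. a small "local" KV cache. Each copy
# is registered under a unique layer index so the caches never collide.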

class LoopCoderAttention(nn.Module):
    def __init__(
        self,
        config: PretrainedConfig,
        hidden_size: int,
        num_heads: int,
        num_kv_heads: int,
        max_position: int = 4096 * 32,
        cache_config: CacheConfig | None = None,
        quant_config: QuantizationConfig | None = None,
        prefix: str = "",
        attn_type: str = AttentionType.DECODER,
        dual_chunk_attention_config: dict[str, Any] | None = None,
        layer_idx: int = 0,
    ) -> None:
        super().__init__()
        self.layer_idx = layer_idx
        self.hidden_size = hidden_size
        tp_size = get_tensor_model_parallel_world_size()
        self.total_num_heads = num_heads
        assert self.total_num_heads % tp_size == 0
        self.num_heads = self.total_num_heads // tp_size
        self.total_num_kv_heads = num_kv_heads
        if self.total_num_kv_heads >= tp_size:
            # Number of KV heads is greater than TP size, so we partition
            # the KV heads across multiple tensor parallel GPUs.
            assert self.total_num_kv_heads % tp_size == 0
        else:
            # Number of KV heads is less than TP size, so we replicate
            # the KV heads across multiple tensor parallel GPUs.
            assert tp_size % self.total_num_kv_heads == 0
        self.num_kv_heads = max(1, self.total_num_kv_heads // tp_size)
        self.head_dim = hidden_size // self.total_num_heads
        self.q_size = self.num_heads * self.head_dim
        self.kv_size = self.num_kv_heads * self.head_dim
        self.scaling = self.head_dim**-0.5
        self.dual_chunk_attention_config = dual_chunk_attention_config

        # Get loop_num from config, default to 2 if not specified
        self.loop_num = getattr(config, "loop_num", 2)
        self.loop_window_size = getattr(config, "loop_window_size", 64)
        # Use total number of hidden layers instead of hardcoded 24
        total_layers = config.num_hidden_layers

        self.qkv_proj = QKVParallelLinear(
            hidden_size,
            self.head_dim,
            self.total_num_heads,
            self.total_num_kv_heads,
            bias=False,
            quant_config=quant_config,
            prefix=f"{prefix}.qkv_proj",
        )
        self.o_proj = RowParallelLinear(
            self.total_num_heads * self.head_dim,
            hidden_size,
            bias=False,
            quant_config=quant_config,
            prefix=f"{prefix}.o_proj",
        )
        self.rotary_emb = get_rope(
            self.head_dim,
            max_position=max_position,
            rope_parameters=config.rope_parameters,
            dual_chunk_attention_config=dual_chunk_attention_config,
        )

        self.attn = nn.ModuleList()
        base_cache_config = cache_config
        for loop_idx in range(self.loop_num):
            base_layer_idx = extract_layer_index(prefix)
            unique_layer_idx = loop_idx * total_layers + base_layer_idx
            unique_prefix = prefix.replace(
                f"layers.{base_layer_idx}", f"layers.{unique_layer_idx}"
            )
            if loop_idx == 0:
                loop_cache_config = cache_config
            else:
                if base_cache_config is not None:
                    loop_cache_config = replace(
                        base_cache_config,
                        sliding_window=self.loop_window_size,
                    )
                else:
                    loop_cache_config = CacheConfig(
                        sliding_window=self.loop_window_size,
                        cache_dtype="auto",
                    )
            self.attn.append(
                Attention(
                    self.num_heads,
                    self.head_dim,
                    self.scaling,
                    num_kv_heads=self.num_kv_heads,
                    cache_config=loop_cache_config,
                    quant_config=quant_config,
                    attn_type=attn_type,
                    prefix=f"{unique_prefix}.attn",
                    **{
                        "layer_idx": unique_layer_idx,
                        "dual_chunk_attention_config": dual_chunk_attention_config,
                    }
                    if dual_chunk_attention_config and loop_idx == 0
                    else {},
                )
            )

    def forward(
        self,
        positions: torch.Tensor,
        hidden_states: torch.Tensor,
        loop_idx: int,
        gate_proj: LoopGateProjection | None = None,
    ) -> torch.Tensor:
        if loop_idx == 0:
            attn = self.attn[0]
            qkv, _ = self.qkv_proj(hidden_states)
            q, k, v = qkv.split([self.q_size, self.kv_size, self.kv_size], dim=-1)
            q, k = self.rotary_emb(positions, q, k)
            attn_output = attn(q, k, v)
            output, _ = self.o_proj(attn_output)
            return output
        else:
            global_attn = self.attn[0]
            local_attn = self.attn[loop_idx]
            qkv, _ = self.qkv_proj(hidden_states)
            q, k, v = qkv.split([self.q_size, self.kv_size, self.kv_size], dim=-1)
            q, k = self.rotary_emb(positions, q, k)
            num_tokens, _ = q.shape
            num_heads = self.num_heads
            head_dim = self.head_dim
            q_reshaped = q.view(num_tokens, num_heads, head_dim).transpose(0, 1)
            global_attn_output = global_attn(q, None, None)
            local_attn_output = local_attn(q, k, v)
            assert gate_proj is not None, "gate_proj must be provided for loop_idx > 0"
            gate = gate_proj(q_reshaped)
            output = global_attn_output * gate + local_attn_output * (1 - gate)
            output, _ = self.o_proj(output)
            return output


class LoopCoderDecoderLayer(nn.Module):
    def __init__(
        self,
        config: PretrainedConfig,
        cache_config: CacheConfig | None = None,
        quant_config: QuantizationConfig | None = None,
        prefix: str = "",
        layer_idx: int = 0,
    ) -> None:
        super().__init__()
        self.hidden_size = config.hidden_size
        dual_chunk_attention_config = getattr(
            config, "dual_chunk_attention_config", None
        )
        self.layer_idx = layer_idx
        if getattr(config, "is_causal", True):
            attn_type = AttentionType.DECODER
        else:
            attn_type = AttentionType.ENCODER_ONLY
        self.self_attn = LoopCoderAttention(
            config=config,
            hidden_size=self.hidden_size,
            num_heads=config.num_attention_heads,
            max_position=config.max_position_embeddings,
            num_kv_heads=config.num_key_value_heads,
            cache_config=cache_config,
            quant_config=quant_config,
            prefix=f"{prefix}.self_attn",
            attn_type=attn_type,
            dual_chunk_attention_config=dual_chunk_attention_config,
            layer_idx=self.layer_idx,
        )
        self.mlp = LoopCoderMLP(
            hidden_size=self.hidden_size,
            intermediate_size=config.intermediate_size,
            hidden_act=config.hidden_act,
            quant_config=quant_config,
            prefix=f"{prefix}.mlp",
        )
        self.input_layernorm = LoopCoderRMSNorm(
            config.hidden_size, eps=config.rms_norm_eps
        )
        self.post_attention_layernorm = LoopCoderRMSNorm(
            config.hidden_size, eps=config.rms_norm_eps
        )

    def forward(
        self,
        positions: torch.Tensor,
        hidden_states: torch.Tensor,
        loop_idx: int,
        gate_proj: LoopGateProjection | None = None,
    ) -> tuple[torch.Tensor, torch.Tensor]:
        residual = hidden_states
        hidden_states = self.input_layernorm(hidden_states)
        hidden_states = self.self_attn(
            positions=positions,
            hidden_states=hidden_states,
            loop_idx=loop_idx,
            gate_proj=gate_proj,
        )
        hidden_states = hidden_states + residual
        residual = hidden_states
        hidden_states = self.post_attention_layernorm(hidden_states)
        hidden_states = self.mlp(hidden_states)
        hidden_states = hidden_states + residual
        return hidden_states


class LoopGateProjection(nn.Module):
    """Gate projection for mixed attention in Loop 2.

    Computes: g = sigmoid(linear(Q)) for each head independently.
    This gate determines how much to use Loop 1's KV (global)
    vs the current loop's KV (local).

    Supports tensor parallelism: each GPU handles a subset of heads.
    The weight matrix has shape [num_heads, head_dim] and is split
    along the head dimension.
    """

    def __init__(
        self,
        total_num_heads: int,
        head_dim: int,
        quant_config: QuantizationConfig | None = None,
        prefix: str = "",
    ):
        super().__init__()
        self.total_num_heads = total_num_heads
        self.head_dim = head_dim
        tp_size = get_tensor_model_parallel_world_size()
        assert self.total_num_heads % tp_size == 0
        self.num_heads = self.total_num_heads // tp_size
        self.gate_proj = ColumnParallelLinear(
            head_dim,
            self.total_num_heads,
            bias=True,
            gather_output=False,
            quant_config=quant_config,
            prefix=prefix,
        )

    def forward(self, query: torch.Tensor) -> torch.Tensor:
        """Compute gate values from the query tensor.

        Args:
            query: [num_heads, num_tokens, head_dim] (vLLM flattened format)
                where num_heads is the number of heads on this TP rank
                and num_tokens = batch * seq_len

        Returns:
            gate: [num_tokens, num_heads * head_dim]
                (flattened format matching q shape)
        """
        num_heads, num_tokens, head_dim = query.shape
        assert num_heads == self.num_heads, (
            f"Expected {self.num_heads} heads, got {num_heads}"
        )
        query_flat = query.reshape(-1, head_dim)
        gate_logits_flat, _ = self.gate_proj(query_flat)
        gate_logits = gate_logits_flat.reshape(
            num_heads, num_tokens, self.num_heads
        )  # [num_heads, num_tokens, num_heads]
        # Extract diagonal: each head h's query should use output column h
        # gate_logits[h, :, h] gives the output for head h at each token
        gate_logits = torch.diagonal(
            gate_logits, dim1=0, dim2=2
        )  # [num_tokens, num_heads]
        gate_logits = gate_logits.transpose(0, 1)  # [num_heads, num_tokens]
        gate_logits = gate_logits.unsqueeze(-1)  # [num_heads, num_tokens, 1]
        # Apply sigmoid
        gate = torch.sigmoid(gate_logits)  # [num_heads, num_tokens, 1]
        # Expand and reshape to match q shape: [num_tokens, num_heads * head_dim]
        gate = gate.transpose(0, 1)  # [num_tokens, num_heads, 1]
        gate = gate.expand(-1, -1, head_dim)  # [num_tokens, num_heads, head_dim]
        gate = gate.reshape(num_tokens, num_heads * head_dim)
        return gate

@support_torch_compile(
    dynamic_arg_dims={
        "input_ids": 0,
        "positions": -1,
        "intermediate_tensors": 0,
        "inputs_embeds": 0,
    }
)
class IQuestLoopCoderModel(nn.Module):
    def __init__(
        self,
        *,
        vllm_config: VllmConfig,
        prefix: str = "",
        decoder_layer_type: type[nn.Module] = LoopCoderDecoderLayer,
    ):
        super().__init__()

        config = vllm_config.model_config.hf_config
        cache_config = vllm_config.cache_config
        quant_config = vllm_config.quant_config

        # TODO (robertgshaw2): see if this can be moved out
        if cache_config.sliding_window is not None and hasattr(
            config, "max_window_layers"
        ):
            assert config.max_window_layers == config.num_hidden_layers, (
                "Sliding window for some but all layers is not supported. "
                "This model uses sliding window but `max_window_layers` = {} "
                "is less than `num_hidden_layers` = {}. "
                "Please open an issue to discuss this feature.".format(
                    config.max_window_layers,
                    config.num_hidden_layers,
                )
            )

        self.config = config
        self.quant_config = quant_config
        self.vocab_size = config.vocab_size

        self.embed_tokens = VocabParallelEmbedding(
            config.vocab_size,
            config.hidden_size,
            quant_config=quant_config,
            prefix=f"{prefix}.embed_tokens",
        )

        self.loop_num = getattr(self.config, "loop_num", 2)
        self.window_size = getattr(self.config, "loop_window_size", 64)

        # Gate projections for Loop 2 (one per layer)
        head_dim = config.hidden_size // config.num_attention_heads
        _, _, self.gate_projections = make_layers(
            config.num_hidden_layers,
            lambda prefix: LoopGateProjection(
                total_num_heads=config.num_attention_heads,
                head_dim=head_dim,
                quant_config=quant_config,
                prefix=prefix,
            ),
            prefix=f"{prefix}.gate_projections",
        )

        self.start_layer, self.end_layer, self.layers = make_layers(
            config.num_hidden_layers,
            lambda prefix: LoopCoderDecoderLayer(
                config=config,
                cache_config=cache_config,
                quant_config=quant_config,
                prefix=prefix,
                layer_idx=extract_layer_index(prefix),
            ),
            prefix=f"{prefix}.layers",
        )

        self.make_empty_intermediate_tensors = (
            make_empty_intermediate_tensors_factory(
                ["hidden_states", "residual"], config.hidden_size
            )
        )
        self.norm = LoopCoderRMSNorm(config.hidden_size, eps=config.rms_norm_eps)

    def embed_input_ids(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.embed_tokens(input_ids)

    def forward(
        self,
        input_ids: torch.Tensor,
        positions: torch.Tensor,
        intermediate_tensors: IntermediateTensors | None = None,
        inputs_embeds: torch.Tensor | None = None,
    ) -> torch.Tensor | IntermediateTensors:
        if inputs_embeds is not None:
            hidden_states = inputs_embeds
        else:
            hidden_states = self.embed_input_ids(input_ids)

        for loop_idx in range(self.loop_num):
            for layer_idx, layer in enumerate(
                self.layers[self.start_layer : self.end_layer]
            ):
                # Get the actual layer index (accounting for pipeline parallelism)
                actual_layer_idx = self.start_layer + layer_idx
                # Get gate_proj for this layer (only for loop_idx > 0)
                gate_proj = (
                    self.gate_projections[actual_layer_idx] if loop_idx > 0 else None
                )
                hidden_states = layer(positions, hidden_states, loop_idx, gate_proj)

        hidden_states = self.norm(hidden_states)
        return hidden_states
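
    # Reader's note (added for this tutorial): the loader below has two
    # non-obvious remappings.
    #   - stacked_params_mapping folds the checkpoint's separate q/k/v and
    #     gate/up projections into vLLM's fused qkv_proj and gate_up_proj.
    #   - gate_projections.* weights are excluded from that fusion and are
    #     instead remapped onto each LoopGateProjection's inner gate_proj.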
    def load_weights(self, weights: Iterable[tuple[str, torch.Tensor]]) -> set[str]:
        stacked_params_mapping = [
            # (param_name, shard_name, shard_id)
            ("qkv_proj", "q_proj", "q"),
            ("qkv_proj", "k_proj", "k"),
            ("qkv_proj", "v_proj", "v"),
            ("gate_up_proj", "gate_proj", 0),
            ("gate_up_proj", "up_proj", 1),
        ]
        params_dict = dict(self.named_parameters(remove_duplicate=False))
        loaded_params: set[str] = set()
        for name, loaded_weight in weights:
            if "rotary_emb.inv_freq" in name:
                continue
            if self.quant_config is not None and (
                scale_name := self.quant_config.get_cache_scale(name)
            ):
                # Loading kv cache quantization scales
                param = params_dict[scale_name]
                weight_loader = getattr(param, "weight_loader", default_weight_loader)
                loaded_weight = (
                    loaded_weight if loaded_weight.dim() == 0 else loaded_weight[0]
                )
                weight_loader(param, loaded_weight)
                loaded_params.add(scale_name)
                continue
            for param_name, weight_name, shard_id in stacked_params_mapping:
                if "gate_projections" in name:
                    continue
                if weight_name not in name:
                    continue
                name = name.replace(weight_name, param_name)
                # Skip loading extra bias for GPTQ models.
                if name.endswith(".bias") and name not in params_dict:
                    continue
                if name.endswith("scale"):
                    # Remapping the name of FP8 kv-scale.
                    name = maybe_remap_kv_scale_name(name, params_dict)
                    if name is None:
                        continue
                param = params_dict[name]
                weight_loader = getattr(param, "weight_loader", default_weight_loader)
                if weight_loader == default_weight_loader:
                    weight_loader(param, loaded_weight)
                else:
                    weight_loader(param, loaded_weight, shard_id)
                break
            else:
                if name.startswith("gate_projections."):
                    if name.endswith(".weight"):
                        vllm_name = name.replace(".weight", ".gate_proj.weight")
                    elif name.endswith(".bias"):
                        vllm_name = name.replace(".bias", ".gate_proj.bias")
                    else:
                        continue
                    if vllm_name in params_dict:
                        param = params_dict[vllm_name]
                        weight_loader = getattr(
                            param, "weight_loader", default_weight_loader
                        )
                        weight_loader(param, loaded_weight)
                        loaded_params.add(vllm_name)
                        continue
                    continue
                if name.endswith(".bias") and name not in params_dict:
                    continue
                # Remapping the name of FP8 kv-scale.
                name = maybe_remap_kv_scale_name(name, params_dict)
                if name is None:
                    continue
                param = params_dict[name]
                weight_loader = getattr(param, "weight_loader", default_weight_loader)
                weight_loader(param, loaded_weight)
                loaded_params.add(name)
        return loaded_params


class IQuestLoopCoderForCausalLM(nn.Module):
    def __init__(self, *, vllm_config: VllmConfig, prefix: str = ""):
        super().__init__()
        config = vllm_config.model_config.hf_config
        quant_config = vllm_config.quant_config
        self.config = config
        self.quant_config = quant_config
        self.model = IQuestLoopCoderModel(
            vllm_config=vllm_config, prefix=maybe_prefix(prefix, "model")
        )
        if config.tie_word_embeddings:
            self.lm_head = self.model.embed_tokens
        else:
            self.lm_head = ParallelLMHead(
                config.vocab_size,
                config.hidden_size,
                quant_config=quant_config,
                prefix=maybe_prefix(prefix, "lm_head"),
            )
        self.logits_processor = LogitsProcessor(config.vocab_size)
        self.make_empty_intermediate_tensors = (
            self.model.make_empty_intermediate_tensors
        )

    def embed_input_ids(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.model.embed_input_ids(input_ids)

    def forward(
        self,
        input_ids: torch.Tensor,
        positions: torch.Tensor,
        intermediate_tensors: IntermediateTensors | None = None,
        inputs_embeds: torch.Tensor | None = None,
    ) -> torch.Tensor | IntermediateTensors:
        hidden_states = self.model(
            input_ids, positions, intermediate_tensors, inputs_embeds
        )
        return hidden_states

    def compute_logits(
        self,
        hidden_states: torch.Tensor,
    ) -> torch.Tensor | None:
        logits = self.logits_processor(self.lm_head, hidden_states)
        return logits

    def load_weights(self, weights: Iterable[tuple[str, torch.Tensor]]) -> set[str]:
        loader = AutoWeightsLoader(
            self,
            skip_prefixes=(
                ["lm_head."] if self.config.tie_word_embeddings else None
            ),
        )
        return loader.load_weights(weights)
```

✅ At this point, vLLM fully supports the IQuest-Coder model.

## 5. Launching the Model Service

Everything is ready — start the model:

```bash
vllm serve ./IQuest-Coder-V1-40B-Loop-Instruct \
    --host 0.0.0.0 \
    --port 8000 \
    --tensor-parallel-size 4 \
    --trust-remote-code \
    --dtype bfloat16 \
    --gpu-memory-utilization 0.85
```

Parameter reference:

| Parameter | Purpose |
| --- | --- |
| `--tensor-parallel-size 4` | Tensor parallelism across 4 GPUs |
| `--trust-remote-code` | Allow loading custom model classes |
| `--dtype bfloat16` | Use bfloat16 precision to save VRAM |
| `--gpu-memory-utilization 0.85` | Cap VRAM utilization to prevent OOM |

On a successful start you will see output like:

```
INFO:     Started server process [PID]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000
```

Congratulations — your IQuest-Coder 40B model is up and running!
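
Before moving on, you can sanity-check the server from another terminal. A minimal probe, using the `/v1/models` route that vLLM's OpenAI-compatible server exposes:

```python
import requests

# Quick health check: ask the server which models it is serving.
# The response should list ./IQuest-Coder-V1-40B-Loop-Instruct.
print(requests.get("http://localhost:8000/v1/models").json())
```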
## 6. Calling the Model API: Try Your AI Coding Assistant

You can call the model through its OpenAI-compatible interface (a chat-endpoint variant is also sketched at the end of this article).

Example request (Python):

```python
import requests

url = "http://localhost:8000/v1/completions"
headers = {"Content-Type": "application/json"}
data = {
    "model": "./IQuest-Coder-V1-40B-Loop-Instruct",
    "prompt": "Write a quicksort implementation in Python with detailed comments.",
    "max_tokens": 512,
    "temperature": 0.2,
}

response = requests.post(url, json=data, headers=headers)
print(response.json()["choices"][0]["text"])
```

Sample response (excerpt):

```python
def quicksort(arr):
    """Quicksort.

    Args:
        arr: the list to sort.
    Returns:
        A new, sorted list.
    """
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)
```

Fast and accurate — this is what "intelligent programming" should look like!

## 7. Summary

This article walked you through the complete local deployment of IQuest-Coder-V1-40B-Instruct:

- ✅ Built a high-performance inference environment on vLLM
- ✅ Downloaded and loaded the 40B-parameter code model
- ✅ Resolved the `Model architectures [...] are not supported` error caused by the incompatible architecture
- ✅ Extended vLLM to support a new model family by patching it
- ✅ Served a local API to power your own AI coding assistant

The model not only performs well on everyday coding tasks but also leads demanding software engineering evaluations such as SWE-Bench Verified (76.2%) and LiveCodeBench v6 (81.1%), making it one of the most promising code LLMs available today.

Get more AI images: to explore more prebuilt AI images and application scenarios, visit the CSDN 星图 image plaza, which offers a rich catalog of preconfigured images covering LLM inference, image generation, video generation, model fine-tuning, and more, all with one-click deployment.
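
One closing tip, as promised in Section 6: besides the `/v1/completions` endpoint used above, vLLM's OpenAI-compatible server also exposes `/v1/chat/completions`. A minimal sketch, assuming the model repository ships a chat template (vLLM needs one to format chat messages):

```python
import requests

# Hypothetical chat-style request against the same local server
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "./IQuest-Coder-V1-40B-Loop-Instruct",
        "messages": [
            {
                "role": "user",
                "content": "Explain what a binary heap is, with a short Python example.",
            }
        ],
        "max_tokens": 512,
        "temperature": 0.2,
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```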