Likely because they don't have a suitable SoTA base model of any other size to build on. DeepSeek V3 is 671B, while DeepSeek-Prover-V1.5 [1] is only 7B, built on DeepSeekMath (7B), which in turn is based on DeepSeek-Coder-Base-7B-v1.5. DeepSeek-Coder-V2 (16B and 236B) might have been a good starting point, but it was merged into DeepSeek V2.5, and V2.5 is inferior to V3. Some version of Qwen is another option.
Also notable: even the earliest planning for a well-received release of a new model might include market segmentation both by parameter count and by skill type.
--> "In an increasingly crowded field of LLMs, how will our (costly to produce) model stand out?"