
[Speech Recognition] iFlytek (Xunfei) Speech SDK for Clojure: speech recognition / text-to-speech

    stevechan · 2017-07-13 15:01:36 +08:00 · 2970 views

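    xunfei-clj source code link

    A Clojure wrapper around the iFlytek (Xunfei) speech SDK. It can be used from the Emacs/Vim editors or from the command line for spoken reminders (text-to-speech), speech recognition, turning speech into commands, and so on.

    Only Linux and Windows are supported for now, because the official iFlytek SDK does not yet support Mac.

    Usage: see the example project hello-xunfei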

    ;; 1. Add the dependency to project.clj.
    [xunfei-clj "0.1.4-SNAPSHOT"]

    ;; 2. Put Msc.jar in the project's lib directory, then add the `:resource-paths` option.
    :resource-paths ["lib/Msc.jar"]

    ;; 3. Copy libmsc64.so (Windows: msc64.dll) and libmsc32.so (Windows: msc32.dll) to the project root.

    ;; 4. core.clj:
    (ns hello-xunfei.core
      (:require [xunfei-clj.core :as xunfei]))

    ;; Initialize the Xunfei SDK (you can register your own appid on the Xunfei open platform)
    (xunfei/app-init "your-xunfei-appid")

    ;; Text-to-speech
    (defn xunfei-say-hi
      [text]
      (xunfei/text-to-player text))

    ;; Speech recognition: each parsed result is conj'ed onto the atom
    (def regcog-res (atom (list)))
    (xunfei/record-voice-to-text (fn [] (xunfei/m-reco-listener #(swap! regcog-res conj %))))
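
Steps 1 to 3 above might come together in a project.clj like the following minimal sketch. The project name, its version, and the Clojure version are assumptions for illustration; only the xunfei-clj coordinate and the `:resource-paths` entry come from the post.

```clojure
;; project.clj (sketch; project name and versions other than xunfei-clj are assumed)
(defproject hello-xunfei "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [xunfei-clj "0.1.4-SNAPSHOT"]]
  ;; Msc.jar is the iFlytek SDK jar copied into lib/ by hand (step 2)
  :resource-paths ["lib/Msc.jar"])
```

The native libraries from step 3 are not declared here; they are simply copied to the project root.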
    
    

    Develop

    $ lein repl

    ;; Initialize the Xunfei SDK
    xunfei-clj.core> (app-init "your-xunfei-appid")

    ;; Text-to-speech (the sample text is Chinese, matching the Mandarin voice)
    xunfei-clj.core> (text-to-player "什么语音文学驱动编程?")

    ;; Speech recognition
    xunfei-clj.core> (def regcog-res (atom (list)))
    xunfei-clj.core> (record-voice-to-text (fn [] (m-reco-listener #(swap! regcog-res conj %))))
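
Because the recognition callback only conj'es results onto an atom, one way to react to each new result is a watch on that atom. The following is a rough sketch using plain Clojure; `on-new-result` and `handled` are hypothetical names that are not part of xunfei-clj, and the `swap!` at the end merely simulates what the listener callback would do, so no microphone or SDK is needed to try it.

```clojure
;; Sketch only: `on-new-result` and `handled` are hypothetical, not part of xunfei-clj.
(def regcog-res (atom (list)))
(def handled (atom []))

(defn on-new-result
  "Calls `handler` with each result newly conj'ed onto `results-atom`."
  [results-atom handler]
  (add-watch results-atom ::new-result
             (fn [_key _ref old-state new-state]
               ;; conj on a list prepends, so the newest result is the first element
               (when (> (count new-state) (count old-state))
                 (handler (first new-state))))))

;; Record every result that arrives
(on-new-result regcog-res #(swap! handled conj %))

;; Simulate the recognition callback delivering one (made-up) result
(swap! regcog-res conj {"ws" []})
```

Watches on an atom run synchronously on the thread that performs the `swap!`, so a slow handler would hold up the SDK callback; handing the work to another thread may be preferable in a real setup.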
    
    
    2 replies · 2017-07-18 11:09:47 +08:00
    1 · stevechan (OP) · 2017-07-13 15:17:10 +08:00
    Issues and stars are welcome :-)
    2 · stevechan (OP) · 2017-07-18 11:09:47 +08:00
    It takes fewer than 100 lines of code to implement:

    ```clojure

    (ns xunfei-clj.core
      (:require [cheshire.core :as cjson])
      (:import [com.iflytek.cloud.speech
                SpeechRecognizer
                SpeechConstant
                SpeechUtility
                SpeechSynthesizer
                SynthesizerListener
                SynthesizeToUriListener
                SpeechError
                RecognizerListener
                RecognizerResult]
               [org.json JSONArray JSONObject JSONTokener]))

    ;; Initialize the Xunfei SDK: (app-init "59145fb0").
    ;; Register your own appid on the Xunfei open platform, or use my test appid.
    (defn app-init
      [appid]
      (let [appid (str SpeechConstant/APPID "=" appid)]
        (SpeechUtility/createUtility appid)))

    ;; Synthesis listener: a proxy of SynthesizerListener whose callbacks can be
    ;; filled in to control playback (all of them are no-ops here).
    (defn m-syn-listener-gen
      []
      (proxy [SynthesizerListener] []
        (onCompleted [_])
        (onBufferProgress [^Integer percent ^Integer begin-pos ^Integer end-pos ^String info])
        (onSpeakBegin [])
        (onSpeakPaused [])
        (onSpeakProgress [^Integer percent ^Integer begin-pos ^Integer end-pos])
        (onSpeakResumed [])))

    ;; (read-text-to-voice "Input text, synthesized into speech by the Xunfei synthesizer"
    ;;                     (fn [m-tts text] ...play it, or save it to an audio file...))
    (defn read-text-to-voice
      [text output-fn]
      (let [m-tts (doto (SpeechSynthesizer/createSynthesizer)
                    (.setParameter SpeechConstant/VOICE_NAME "xiaoyan")
                    (.setParameter SpeechConstant/SPEED "50")
                    (.setParameter SpeechConstant/VOLUME "80")
                    (.setParameter SpeechConstant/TTS_AUDIO_PATH "./tts_test.pcm"))]
        (output-fn m-tts text)))

    ;; (text-to-player "This text is played back as speech")
    (defn text-to-player
      [text]
      (read-text-to-voice
       text
       (fn [m-tts text] (.startSpeaking m-tts text (m-syn-listener-gen)))))

    ;; Listener used when the speech synthesized from text is saved to a file
    (defn synthesize-to-uri-listener
      []
      (proxy [SynthesizeToUriListener] []
        (onBufferProgress [^Integer progress])
        (onSynthesizeCompleted [^String uri ^SpeechError error])))

    ;; (text-to-vfile "Save the speech synthesized from this text to a file" "testest.wav")
    (defn text-to-vfile
      [text url]
      (read-text-to-voice
       text
       (fn [m-tts text]
         (.synthesizeToUri m-tts text url (synthesize-to-uri-listener)))))

    ;; =======>>>> Speech recognition (speech to text) below ====>>>>>

    ;; Recognition listener. Usage:
    ;; (def regcog-res (atom (list)))
    ;; (m-reco-listener #(swap! regcog-res conj %))
    (defn m-reco-listener
      [result-fn]
      (proxy [RecognizerListener] []
        (onResult [^RecognizerResult results ^Boolean is-last]
          ;; Parse the JSON result string and hand the parsed map to result-fn
          (let [res (-> results .getResultString cjson/parse-string)]
            (println "Recognition result:=>" res)
            (result-fn res)))
        (onError [^SpeechError error]
          ;; Print the description instead of silently discarding it
          (println "Recognition error:=>" (.getPlainDescription error true)))
        (onBeginOfSpeech [])
        (onVolumeChanged [^Integer volume])
        (onEndOfSpeech [])
        (onEvent [^Integer eventType ^Integer arg1 ^Integer arg2 ^String msg])))

    ;; (record-voice-to-text (fn [] (m-reco-listener #(swap! regcog-res conj %))))
    ;; `listener-fn` is a zero-argument function that returns a RecognizerListener.
    (defn record-voice-to-text
      [listener-fn]
      (let [m-iat (doto (SpeechRecognizer/createRecognizer)
                    (.setParameter SpeechConstant/DOMAIN "iat")
                    (.setParameter SpeechConstant/LANGUAGE "zh_cn")
                    (.setParameter SpeechConstant/ACCENT "mandarin"))]
        (.startListening m-iat (listener-fn))))

    ```
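
The listener above hands each parsed JSON result to `result-fn` as a map with string keys. For the "speech into commands" use case, something along these lines might work. It is only a sketch: the result shape in `sample-result` (a `"ws"` list of words, each with `"cw"` candidates carrying a `"w"` string) is an assumption about the SDK's output rather than something the post confirms, and `result->text`, `commands`, and `text->command` are hypothetical helpers that are not part of xunfei-clj.

```clojure
;; ASSUMPTION: a parsed recognition result looks roughly like this map.
;; Check the actual output printed by the listener before relying on it.
(def sample-result
  {"sn" 1
   "ls" false
   "ws" [{"bg" 0 "cw" [{"sc" 0 "w" "打开"}]}
         {"bg" 0 "cw" [{"sc" 0 "w" "编辑器"}]}]})

(defn result->text
  "Joins the first candidate word of every entry under \"ws\" into one string."
  [res]
  (->> (get res "ws")
       (map #(-> % (get "cw") first (get "w")))
       (apply str)))

;; Hypothetical command table: recognized phrase -> command keyword
(def commands
  {"打开编辑器" :open-editor
   "保存文件"   :save-file})

(defn text->command
  "Looks up a recognized phrase, falling back to :unknown."
  [text]
  (get commands text :unknown))
```

With these helpers, a callback such as `#(-> % result->text text->command)` could be passed to `m-reco-listener` in place of the plain `swap!`. Exact string matching is the simplest possible dispatch and will miss phrases with punctuation or filler words.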