Published in Better Programming · May 27
Convert Text to Phoneme in Python
Simple phonemization of words and texts in many languages
By reading this piece, you will learn how to convert an input text string to its corresponding phonemes in Python. A phoneme represents the smallest unit of sound in a language. For example, the word tab consists of three phonemes: /t/ /a/ /b/. The element b is distinguishable when compared…
Programming · 5 min read

Published in Towards Data Science · May 19
20 Open-Source Single Speaker Speech Datasets
A comprehensive open-source multi-lingual speech data
Speech synthesis, also known as text-to-speech (TTS), is one of the new key technologies in the artificial intelligence domain. It provides the capability to generate human-like voices from text input dynamically. TTS can be applied for a variety of purposes and is closely tied to automation services. However, training a text-to-speech…
Text To Speech · 8 min read

Published in Towards Data Science · Apr 28
Convert PASCAL VOC XML to YOLO for Object Detection
Tips and tricks to preprocess image datasets
This tutorial covers the following step-by-step guides: convert XML annotations to YOLO annotations; visualize the bounding boxes in an image using the newly created YOLO annotations; split the datasets into train, validation and test sets. Overview: PASCAL VOC XML. The PASCAL Visual Object Classes (VOC) project is one of the earliest computer vision projects that…
Object Detection · 9 min read

Published in Better Programming · Apr 15
Data Augmentation With AugLy
All-in-one augmentation packages for machine learning
In the world of machine learning, data augmentation is one of the most useful techniques to enhance the performance of ML models. Data augmentation serves to create synthetic data via slight modification or transformation to the existing data. This helps to: increase the amount of training and test data; reduce…
Programming · 4 min read

Published in Level Up Coding · Mar 8
How to Remove Personally Identifiable Information (PII) from Audio and Videos
Removing Personally Identifiable Information (PII) from transcription text
In my previous article, I covered How to Transcribe Audio Files to Text. In this tutorial, let's go a little further and explore how to remove Personally Identifiable Information (PII) from the transcription. Based on the US Office of Privacy and Open Government…
Python · 6 min read

Published in Level Up Coding · Mar 2
Profanity Filtering in Speech
Replace offensive words with asterisks
Previously, I published a tutorial on Speech Content Safety Detection, which identifies sensitive content such as pornography, terrorism, and hate speech in speech. …
Python · 5 min read

Published in Level Up Coding · Feb 21
How to Detect Topics in Speech
Identify relevant topics based on IAB Taxonomy
Topic detection is a technique to discover the abstract topic behind a collection of documents. It is mostly used in natural language processing to classify text into specific topics/domains. Although there are no rules and regulations on how topics should be…
Python · 5 min read

Published in Better Programming · Feb 9
Build a Speech Content Safety Detection Tool Using Python
Identify sensitive content in your audio/video recordings
Content Moderation is when user-generated content is actively monitored on a platform to determine whether or not the content is permissible to be published or used on it. This is mainly to prevent and filter out any potentially offensive or unrelated content from affecting the users' experience on the platform. …
Programming · 6 min read

Published in Towards Data Science · Jan 31
Convert Images to Tensors in Pytorch and Tensorflow
Learn to transform data natively
As you may know, tensors represent the building blocks of your machine learning project. They are immutable and can only be created, not updated. …
Pytorch · 4 min read

Published in Level Up Coding · Jan 4
Speech Recognition with Sentiment Analysis
Classify speech as positive, negative, or neutral
The topic for today is sentiment analysis on voice data. For your information, sentiment analysis is a technique to identify, extract and quantify data as positive, negative, or neutral. …
Sentiment Analysis · 4 min read