In contrast to a transcript or other tagging, which does not refer directly to the speech signal via the physical time scale, a segmentation always combines time information with categorical content. We distinguish here between segments and points in time, as well as between manual and automatic segmentation and labeling.
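
The distinction between segments and points in time can be illustrated with a minimal sketch. The class names `Segment` and `PointEvent`, their field names, the use of seconds as the time unit, and the example labels are all hypothetical and chosen only for illustration; they are not taken from any particular annotation format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A labeled stretch of the signal: time information plus a category."""
    start: float  # begin time in seconds (assumed unit)
    end: float    # end time in seconds (assumed unit)
    label: str    # categorical content, e.g. a word or phoneme symbol

    def duration(self) -> float:
        # Length of the segment on the physical time scale.
        return self.end - self.start


@dataclass(frozen=True)
class PointEvent:
    """A labeled point in time: a single time stamp plus a category."""
    time: float   # time in seconds (assumed unit)
    label: str    # categorical content, e.g. an event marker


# Hypothetical example values, not taken from a real corpus.
word = Segment(start=0.50, end=0.75, label="hello")
burst = PointEvent(time=0.5, label="burst")
```

A segment carries two time values and therefore has a duration, whereas a point in time carries only one; in both cases the label supplies the categorical content.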