VNU-UET Repository: No conditions. Results ordered -Date Deposited.
Feed date: 2024-03-28T23:36:37Z
EPrints: https://eprints.uet.vnu.edu.vn/eprints/

MAI_ARM: An Intelligent Robotic Arm Using Multimodal Artificial Intelligence (original title: MAI_ARM: Robot Tay Máy Thông Minh sử dụng Trí tuệ nhân tạo Đa thể thức)
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4661
Deposited: 2021-12-10T10:58:19Z
Robotic arm systems play a very important role in modern industrial manufacturing. Applying robotic arms in everyday life would help people greatly, for example by fully automating the distribution of medicine and food to COVID-19 patients. Using robotic arms in daily life requires a way for users to interact with the robot easily. In this paper, a method for interacting with a robotic arm using multimodal artificial intelligence that combines voice and images is developed. The paper also proposes a Chessboard Calibration method that improves the accuracy of locating the robotic arm's execution position. A 4-degree-of-freedom (4-DOF) robotic arm built by 3D printing is used to run and evaluate the resulting multimodal artificial intelligence model. The system in operation was recorded and is presented at: https://youtu.be/RHgAyHXMH6I
Authors: Bao Minh Dinh (minhdinh@vnu.edu.vn); Duc Son Tran (transonhhc12c5@gmail.com); The Huong Nguyen (huongrbe@gmail.com); Nam Do (donam.2801@gmail.com); Minh Hoang Le (lhoang17062000@gmail.com); Van Xiem Hoang (xiemhoang@vnu.edu.vn)

Evaluating and Optimizing the Hector SLAM Algorithm for Mapping and Localization on the Pimouse Robot (original title: Đánh Giá và Tối Ưu Thuật Toán Hector SLAM Ứng Dụng Lập Bản Đồ và Định Vị Trên Pimouse Robot)
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4660
Deposited: 2021-12-10T10:58:14Z
Mapping and localization are two of the four fundamental problems of a mobile robot system.
Two common approaches to this problem today use a LiDAR system and/or an image sensor system, together with algorithms that process the acquired data. The approach based on LiDAR and the Hector SLAM algorithm produces maps with high accuracy but requires the algorithm's parameters to be tuned. To understand this issue, we study and evaluate the main parameters that affect the runtime performance of Hector SLAM on a mobile robot system that uses LiDAR for mapping and localization. System performance is evaluated in two respects: i) the quality of the resulting map and ii) CPU usage. With a clear understanding of how the Hector SLAM parameters affect system performance, users can flexibly adjust these parameters to suit the robot in use. The results are demonstrated on the Pimouse Robot, a mobile robot system developed by RT Corporation, Japan.
Authors: Bao Minh Dinh (minhdinh@vnu.edu.vn); Anh Viet Dang (vietda@vnu.edu.vn); Canh Thanh Nguyen (canhthanhlt@gmail.com); Van Xiem Hoang (xiemhoang@vnu.edu.vn)

Fast QTMT for H.266/VVC Intra Prediction using Early-Terminated Hierarchical CNN model
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4659
Deposited: 2021-12-10T10:58:11Z
Versatile Video Coding (VVC) was standardized in July 2020. Compared with the previous High Efficiency Video Coding (HEVC) standard, VVC saves up to 50% bitrate at equal perceptual video quality. To reach this efficiency, the Joint Video Experts Team (JVET) introduced a number of improved techniques into the VVC model. As a result, the complexity of VVC encoding also increases greatly. One of
the new techniques contributing to this growth in complexity is the quad-tree with nested multi-type tree (QTMT) structure, which includes binary and ternary splits and produces VVC blocks of various square and rectangular shapes. Based on this, we propose in this paper a new deep-learning-based fast QTMT method. We use a trained convolutional neural network (CNN) model, named Early-Terminated Hierarchical CNN, to predict the coding unit map, which is then fed into the VVC encoder to terminate the block partitioning process early. Experimental results show that the proposed method can save 30.29% encoding time with a
negligible BD-Rate increase.
Authors: Van Xiem Hoang (xiemhoang@vnu.edu.vn); Quang Sang Nguyen (ngsang998@gmail.com); Bao Minh Dinh (minhdinh@vnu.edu.vn); Ngoc Minh Do (ngocminhc2nc1@gmail.com); Trieu Duong Dinh (duongdt@vnu.edu.vn)

An Improved TZ Search Algorithm for Accelerating H.266/Versatile Video Coding (original title: Cải tiến thuật toán TZ Search cho tăng tốc mô hình mã hóa H.266/Versatile Video Coding)
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4203
Deposited: 2020-12-08T15:30:57Z
Improving video coding standards has attracted considerable attention recently, driven by the growing demands of multimedia communication applications. To date, the newest video coding standard is H.266/VVC (Versatile Video Coding). With these improvements, H.266/VVC achieves bitrate savings of up to 50% compared with the widely used H.265/HEVC (High Efficiency Video Coding) standard while keeping the decoded video quality unchanged. However, to reach this coding performance, H.266/VVC requires 5 to 30 times the encoding time of H.265/HEVC. The main cause is the need to search for a matching block in a large search space and across more search cases. To address this problem, the paper proposes an improved fast search algorithm based on TZ-Search (Test Zone Search) that speeds up encoding more effectively when used in H.266/VVC.
Evaluation results show that the improved TZ-Search algorithm can reduce H.266/VVC encoding time by up to 12.6% compared with the original TZ-Search while still maintaining high coding performance.
Authors: Thanh Huong Bui; Quang Sang Nguyen; Trieu Duong Dinh; Duc Trinh Chu; Van Xiem Hoang

Improving performance of distributed video coding by consecutively refining of side information and correlation noise model
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4027
Deposited: 2020-07-18T02:47:19Z
Distributed video coding (DVC) is built on distributed source coding (DSC) principles, where the video statistics are exploited, partly or fully, at the decoder instead of the encoder. In theory, DVC has been proven to incur no performance loss compared with predictive video coding. However, practical implementations remain far from the theoretically optimal performance. DVC coding efficiency depends mainly on creating the side information (SI), a noisy version of the original Wyner-Ziv frame (WZF) at the decoder, and on modeling the correlation noise, the difference between the original WZF and the corresponding SI. The performance of the DVC scheme improves when the SI and correlation noise are estimated as accurately as possible.
So, this paper proposes a method to enhance the quality of the SI and of the correlation noise model by using information in decoded WZFs during the decoding …
Authors: Tien Vu Huu (tienvh@ptit.edu.vn); Thao Nguyen Thi Huong (thaotb07@gmail.com); Minh Nguyen Ngoc; Van Xiem Hoang (xiemhoang@vnu.edu.vn)

Adaptive Quantization Parameter Estimation for HEVC Based Surveillance Scalable Video Coding
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4026
Deposited: 2020-07-18T02:46:31Z
Visual surveillance systems play a vital role in modern life, with a large number of applications ranging from remote home management and public security to traffic monitoring. The recent High Efficiency Video Coding (HEVC) scalable extension, namely SHVC, provides not only compression efficiency but also adaptive streaming capability. However, SHVC was originally designed for videos of generic scenes rather than for visual surveillance systems. In this paper, we propose a novel HEVC-based surveillance scalable video coding (SSVC) framework. First, to achieve high-quality inter prediction, we propose a long-term reference coding method that adaptively exploits the temporal correlation among frames in surveillance video. Second, to optimize SSVC compression performance, we design a quantization parameter adaptation mechanism in which the relationship between SSVC rate-distortion (RD) performance and the quantization parameter is statistically modeled by a fourth-order polynomial function. An appropriate quantization parameter is then derived for frames at the long-term reference position.
Experiments conducted on a common set of surveillance videos show that the proposed SSVC significantly outperforms the relevant SHVC standard, notably with around 6.9% and 12.6% bitrate savings for the low delay (LD) and random access (RA) coding configurations, respectively, while still providing similar perceptual decoded frame quality.
Authors: Van Xiem Hoang (xiemhoang@vnu.edu.vn)

Improving TDWZ Correlation Noise Estimation: A Deep Learning based Approach
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4025
Deposited: 2020-07-18T02:46:20Z
Transform domain Wyner-Ziv video coding (TDWZ) has shown its benefits for video compression in resource-limited applications such as visual surveillance systems, remote sensing, and wireless sensor networks. In TDWZ, the correlation noise model (CNM) plays a vital role since it directly affects the number of bits that must be sent from the encoder and thus the overall TDWZ compression performance. To obtain a highly accurate CNM for TDWZ, we propose in this paper a novel CNM estimation approach in which the CNM with Laplacian distribution is adaptively estimated by a deep learning (DL) mechanism. The proposed DL-based CNM includes two hidden layers and a linear activation function to adaptively update the Laplacian parameter.
Experimental results show that the proposed TDWZ codec significantly outperforms the relevant benchmarks, notably with around 35% bitrate saving compared to the DISCOVER codec and around 22% bitrate saving compared to the HEVC Intra benchmark, while providing similar perceptual quality.
Authors: Tien Vu Huu (tienvh@ptit.edu.vn); Nguyen Thi Huong Thao (thaotb07@gmail.com); San Vu Van (sanvv@ptit.edu.vn); Van Xiem Hoang (xiemhoang@vnu.edu.vn)

Analysis and Performance Evaluation of Video Coding with the H.265/HEVC Standard (original title: Phân tích, Đánh giá hiệu năng mã hóa video với chuẩn H.265/HEVC)
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3785
Deposited: 2019-12-10T03:50:20Z
Today, alongside the emergence of
high-quality video formats (resolutions up to 8k×4k), demand is growing for video services that fit the transmission capacity of current network infrastructure (whose throughput cannot be increased immediately) and that run on modern personal devices (such as mobile phones and tablets). Improving video coding (compression) standards is an essential need that has been pursued over the past few decades. The most advanced video coding standard to date is H.265/HEVC (High Efficiency Video Coding), with many improvements in coding efficiency, integration with transport systems, data loss resilience, and support for parallel processing architectures. On that basis, the paper presents the overall architecture of H.265/HEVC, analyzes its new and notable features compared with earlier video coding standards, and evaluates its performance through simulation on two factors: compression performance and algorithmic complexity. The results will help promote the use of H.265/HEVC in practical applications in Vietnam and worldwide.
Authors: Thanh Huong Bui (huong1204@gmail.com); Cong Huy Phi (huypc@ptit.edu.vn); Trieu Duong Dinh (duongdt@vnu.edu.vn); Van Xiem Hoang (xiemhoang@vnu.edu.vn)

Distortion Model based on Perceptual of Local Image Content
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3457
Deposited: 2019-05-28T13:41:36Z
As humans are the ultimate receivers of the majority of visual signals being processed, the most accurate way of assessing image quality is to ask humans for their opinions of an image's quality, known as the subjective visual quality assessment
(VQA). The subjective image quality scores gathered from all subjects are processed into the mean opinion score (MOS), which is regarded as the ground truth of image quality. Because the sensitivity of the human visual system (HVS) varies with the features of an image patch, a novel coding distortion modelling method for local image perception is proposed in this paper. An experimental quality assessment database for image patches has been developed. The mean opinion score is regarded as an essential parameter, while the QP-MOS sigmoid curve is determined by local image content.
Authors: Thanh Tung Pham (tung@vinafire.com.vn); Trieu Duong Dinh (duongdt@vnu.edu.vn); Van Xiem Hoang (xiemhoang@vnu.edu.vn); Tien Vu Huu (tienvh@ptit.edu.vn); Thanh Ha Le (ltha@vnu.edu.vn)

Method for Distributed Video Encoding and Decoding (original title: PHƯƠNG PHÁP MÃ HÓA VÀ GIẢI MÃ VIDEO PHÂN TÁN)
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3458
Deposited: 2019-05-28T13:41:16Z
The invention proposes a distributed video encoding and decoding method comprising an encoding process and a decoding process. The distributed video encoding method is carried out first at the encoder side (the WZ encoder) with the steps: partitioning the video sequence, encoding the KEY frames, and encoding the WZ frames. It is then carried out at the decoder side (the WZ decoder) with the steps: decoding the KEY frames and decoding the WZ frames. Encoding and decoding of the KEY frames use the JEM (Joint Exploration Model) encoder/decoder.
Authors: Van Xiem Hoang (xiemhoang@vnu.edu.vn); Trieu Duong Dinh (duongdt@vnu.edu.vn)

A Frame Loss Concealment Solution for Spatial Scalable HEVC using Base Layer Motion
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3450
Deposited: 2019-05-27T08:30:46Z
Scalable High Efficiency Video Coding (SHVC) is the most recent video coding solution designed mainly for network-adaptive or device-adaptive applications. SHVC follows a layered coding structure with one base layer (BL) and one or several enhancement layers (ELs), which can be unequally protected.
However, SHVC is often sensitive to packet loss in unreliable networks, especially in the ELs. In this paper, we propose a novel error concealment method for the SHVC EL, assuming that the BL is well protected. First, we recover the partitioning and resample the motion data from the collocated BL frame. Next, we remove outliers from the motion field with a motion vector refinement algorithm. Lastly, we conceal the lost frame using motion compensation and a deblocking filter. Experiments conducted with a rich set of test sequences for the spatial-scalable SHVC standard show that the proposed method significantly outperforms the relevant error concealment methods, e.g., BL Reconstruction Up-sampling (RU) and BL-SKIP, in both subjective and objective quality assessments.
Authors: Huu Thuc Nguyen (thuckechsu02@gmail.com); Canh Thuong Nguyen (Ngcthuong@gmail.com); Van Xiem Hoang (xiemhoang@vnu.edu.vn); Jeon Byeungwoo (bjeon@skku.edu)

A LOW COMPLEXITY WYNER-ZIV CODING SOLUTION FOR LIGHT FIELD IMAGE TRANSMISSION AND STORAGE
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3452
Deposited: 2019-05-27T08:30:27Z
Compressing Light Field (LF) imaging data is a challenging but very important task for both LF image transmission and storage applications. In this paper, we propose a novel coding solution for LF images using the well-known Wyner-Ziv information theorem. First, the LF image is decomposed into the 4D LF data format. Using a spiral scanning procedure, a pseudo-sequence of LF 4D images is generated. This sequence is then compressed in a distributed coding manner as specified by the Wyner-Ziv theorem. In this context, low computational complexity can be achieved at the encoder, since the computationally expensive motion estimation part is shifted to the decoder.
In addition, we introduce a novel adaptive frame skipping algorithm to further exploit the high correlation between 4D LF images. Experimental results show that the proposed WZ-coding-based LF image solution achieves a significant compression gain, notably around 54% bitrate saving compared with the standard High Efficiency Video Coding (HEVC) Intra benchmark, while requiring less computational complexity.
Authors: Cong Huy Phi (huypc@ptit.edu.vn); Perry Stuart (Stuart.Perry@uts.edu.au); Van Xiem Hoang (xiemhoang@vnu.edu.vn)

A Novel Fusion Method for 3D-TV View Synthesis Using Temporal and Disparity Correlations
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3454
Deposited: 2019-05-27T08:29:52Z
Modified: 2019-06-03T04:11:31Z
View synthesis techniques such as depth-image-based rendering (DIBR) play a significant role in 3D content creation for 3D-TV. However, perceptual errors introduced by current view synthesis often result in severe distortions in synthesized images. In this paper, we propose a novel view synthesis fusion (VSF) method that adaptively exploits temporal and disparity correlations to improve the quality of the synthesized picture. The proposed VSF method defines a robust correlation assessment metric for fusing several pre-created virtual view candidates. Unlike conventional methods, the proposed fusion algorithm is applied to both hole and non-hole areas.
Experimental results show that the proposed method achieves significantly better peak signal-to-noise ratio (PSNR) and subjective visual quality than conventional methods.
Authors: Trieu Duong Dinh (duongdt@vnu.edu.vn); Minh Le Dinh (minhle2994@gmail.com); Jeon Byeungwoo (bjeon@skku.edu); Van Xiem Hoang (xiemhoang@vnu.edu.vn)

A Novel Consistent Quality Driven for JEM based Distributed Video Coding
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3455
Deposited: 2019-05-27T08:29:21Z
Modified: 2019-06-03T04:11:07Z
Distributed video coding (DVC) is an attractive and promising solution for video applications with low-complexity constraints, such as wireless sensor networks or wireless surveillance systems. In DVC, visual quality consistency is one of the most important issues in evaluating the performance of a DVC codec. However, the quality of the decoded frames in most recent DVC codecs is not consistent and fluctuates strongly. To solve this problem, we propose a novel DVC solution named JEM-based DVC (JEM-DVC), which provides not only higher performance than traditional DVC solutions but also an effective scheme for quality consistency control. In the proposed JEM-DVC solution, we first employ several advanced techniques provided in the Joint Exploration Model (JEM) of the future video coding standard (FVC) to improve the performance of the JEM-DVC codec. Then, for consistent quality control, we propose two novel methods named key frame quantization (KF-Q) and Wyner-Ziv frame quantization (WZF-Q), which determine the optimal values of the quantization parameter (QP) and quantization matrix (QM) applied to key and WZ frame coding, respectively.
Unlike conventional approaches, the optimal values of QP and QM are adaptively controlled and updated for every key and WZ frame to guarantee consistent video quality for the proposed codec. The proposed JEM-DVC is the first DVC codec in the literature to employ JEM coding techniques, so all results presented in this paper are new. Experimental results show that the proposed JEM-DVC significantly outperforms the relevant DVC benchmarks, notably the DISCOVER DVC and the recent H.265/HEVC-based DVC, in terms of both peak signal-to-noise ratio (PSNR) performance and consistent visual quality.
Authors: Trieu Duong Dinh (duongdt@vnu.edu.vn); Cong Huy Phi (huypc@ptit.edu.vn); Van Xiem Hoang (xiemhoang@vnu.edu.vn)

Cooperative Caching in Two-Layer Hierarchical Cache aided Systems
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3456
Deposited: 2019-05-27T08:28:49Z
Caching has received much attention as a promising technique to meet the high data rate and stringent latency requirements of future wireless networks. The premise of caching is to prefetch the most popular contents closer to end users in the local caches of edge nodes, e.g., the base station (BS). When a user requests content that is available in the cache, it can be served directly without being sent from the core network. In this paper, we investigate the performance of hierarchical caching systems in which both the BS and end users are equipped with storage memory. In particular, we propose a novel cooperative caching scheme that jointly optimizes content placement at the BS's and users' caches. The proposed caching scheme is analytically shown to achieve a larger global caching gain than the reference in both uncoded and coded caching strategies.
Finally, numerical results are presented to demonstrate the effectiveness of the proposed caching algorithm.
Authors: Van Xiem Hoang (xiemhoang@vnu.edu.vn); Thi Hang Duong (hangdt@haui.edu.vn); Anh Vu Trinh (vuta@vnu.edu.vn); Xuan Thang Vu (thang.vu85@gmail.com)

View Synthesis Method for 3D Video Coding Based on Temporal and Inter View
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3289
Deposited: 2018-12-20T06:04:47Z
Authors: Tung Long Vuong; Dinh Minh Le (minhld_57@vnu.edu.vn); Van Xiem Hoang (xiemhoang@vnu.edu.vn); Trieu Duong Dinh (duongdt@vnu.edu.vn); Huu Tien Vu; Thanh Ha Le (ltha@vnu.edu.vn)

Complexity Controlled Side Information Creation for Distributed Scalable Video Coding
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3336
Deposited: 2018-12-18T02:57:41Z
Distributed scalable video coding (DSVC) has recently been gaining much attention due to its benefits in computational complexity, error resilience, and scalability, which are important for emerging video applications such as wireless sensor networks and visual surveillance systems (VSS). In DSVC, side information (SI) creation plays a key role as it directly affects DSVC compression performance and encoder/decoder computational complexity. However, in many VSS applications, the energy of each VSS node decreases over time, making it difficult to transmit surveillance video in real time. To address this problem, we propose a complexity-controlled SI creation solution for the new DSVC framework. To achieve flexible SI creation, the complexity of the SI creation process is modeled with a linear model whose parameters are estimated by a fitting process.
To adjust the SI complexity, a user parameter is defined based on the availability of the VSS energy resource. Experiments conducted on a rich set of video surveillance data reveal the benefits of the proposed complexity control solution, notably in both complexity control and compression performance.
Authors: Quang Hoang Van (quanghvdt@gmail.com); Le Dao Thi Hue (hueledao94@gmail.com); Vien Du Dinh (dudinhvien@gmail.com); Vu Nguyen Hong (vu.nguyenhong@gmail.com); Van Xiem Hoang (xiemhoang@vnu.edu.vn)

Coding distortion modelling method for local image perception
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3290
Deposited: 2018-12-17T02:52:03Z
Authors: Thanh Tung Pham; Trieu Duong Dinh (duongdt@vnu.edu.vn); Van Xiem Hoang (xiemhoang@vnu.edu.vn); Huu Tien Vu; Thanh Ha Le (ltha@vnu.edu.vn)

Cooperative Caching in Two-Layer Hierarchical Cache-aided Systems
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3249
Deposited: 2018-12-14T01:42:08Z
Modified: 2018-12-18T07:02:04Z
Caching has received much attention as a promising technique to meet the high data rate and stringent latency requirements of future wireless networks. The premise of caching is to prefetch the most popular contents closer to end users in the local caches of edge nodes, e.g., the base station (BS). When a user requests content that is available in the cache, it can be served directly without being sent from the core network. In this paper, we investigate the performance of hierarchical caching systems in which both the BS and end users are equipped with storage memory. In particular, we propose a novel cooperative caching scheme that jointly optimizes content placement at the BS's and users' caches.
The proposed caching scheme is analytically shown to achieve a larger global caching gain than the reference in both uncoded and coded caching strategies. Finally, numerical results are presented to demonstrate the effectiveness of the proposed caching algorithm.
Authors: Van Xiem Hoang (xiemhoang@vnu.edu.vn); Hang Duong; Anh Vu Trinh (vuta@vnu.edu.vn); Xuan Thang Vu (thang.vu85@gmail.com)

Joint Layer Prediction for Improving SHVC Compression Performance and Error Concealment
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3088
Deposited: 2018-11-20T08:58:20Z
Modified: 2018-11-21T09:07:21Z
The Scalable High Efficiency Video Coding (SHVC) standard is expected to play a more important role in the heterogeneous landscape of broadcasting, multimedia, networks, and various service applications, as it is specified as a layered coding technique in ATSC (Advanced Television Systems Committee) 3.0. However, its block-based structure of temporal and spatial prediction makes it sensitive to information loss and error propagation caused by transmission errors. In this context, we propose an improved SHVC with a joint layer prediction (JLP) solution that adaptively combines the decoded information from the base and enhancement layers to create an additional reference for the SHVC enhancement encoder. To optimize the quality of the joint prediction, minimum mean square error (MMSE) estimation is used to compute a combination factor that weights each layer's contribution of decoded information. In addition, the proposed JLP is integrated into the SHVC decoder as an error concealment solution to mitigate the error propagation that inevitably occurs in practical video transmission. Experiments show that the proposed SHVC framework significantly outperforms its relevant benchmarks, notably by up to 14.8% in bitrate reduction with respect to the standard SHVC codec.
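
As a rough illustration of the MMSE combination step mentioned in the joint layer prediction abstract above, the Python sketch below fuses two decoded versions of a block with a single weight. The closed form is the standard least-squares solution for a two-input linear combination; whether the paper uses exactly this form is an assumption, and the name `mmse_weight` and the sample values are hypothetical.

```python
def mmse_weight(original, base, enh):
    """Weight w minimizing sum((original - (w*base + (1-w)*enh))**2).

    Standard least-squares closed form; assumed here, not taken from the paper.
    """
    num = sum((x - e) * (b - e) for x, b, e in zip(original, base, enh))
    den = sum((b - e) ** 2 for b, e in zip(base, enh))
    # Identical inputs: every weight gives the same result, so use 0.5.
    return 0.5 if den == 0 else num / den


def combine(base, enh, w):
    """Joint prediction as a weighted sum of the two layers."""
    return [w * b + (1.0 - w) * e for b, e in zip(base, enh)]


# Toy 1-D blocks (hypothetical values).
original = [10.0, 20.0, 30.0, 40.0]
base = [12.0, 22.0, 32.0, 42.0]   # base-layer reconstruction
enh = [8.0, 18.0, 28.0, 38.0]     # enhancement-layer reconstruction
w = mmse_weight(original, base, enh)
joint = combine(base, enh, w)
print(w, joint)
```

Note that this form needs the original samples, which exist only at the encoder.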
The proposed SHVC error concealment strategy also greatly improves the concealed picture quality and reduces error propagation compared with conventional error concealment approaches.
Authors: Van Xiem Hoang (xiemhoang@vnu.edu.vn); Jeon Byeungwoo (bjeon@skku.edu)

Artificial Intelligence Based Adaptive GOP Size Selection for Effective Wyner-Ziv Video Coding
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3086
Deposited: 2018-10-09T09:47:25Z
Wyner-Ziv video coding (WZVC) has been gaining
much attention in recent decades due to its low computational complexity and error resilience benefits, notably when compared to traditional video coding standards such as H.264/AVC or High Efficiency Video Coding (HEVC). In a Wyner-Ziv video coding scheme, the compression efficiency can be controlled by the length of the group of pictures (GOP), which typically consists of two key frames and several WZ frames. However, current Wyner-Ziv video coding solutions usually employ a fixed GOP size or simple adaptive GOP size mechanisms that depend on heuristic features extracted from the video content.
To address the limitations of current GOP size adaptation solutions, we propose in this paper a novel artificial-intelligence-based GOP size adaptation mechanism and integrate it into the most advanced transform domain Wyner-Ziv video coding (TDWZ) architecture. In the proposed mechanism, the proper GOP size is learnt from the correlation between video features and the optimal compression performance. Machine learning techniques are used to select the most suitable video features and the model of the correlation between GOP size and compression performance. Experimental results show that, using the obtained GOP size adaptation mechanism, the TDWZ achieved a compression performance when compared
to relevant benchmarks.
Authors: Thi Huong Thao Nguyen; Cong Huy Phi (huypc@ptit.edu.vn); Huu Tien Vu; Van Xiem Hoang (xiemhoang@vnu.edu.vn)

Base Layer Constrained Error Concealment Solutions for Robust SHVC Video Transmission
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2591
Deposited: 2017-10-29T06:59:34Z
Aiming at a powerful scalable video coding solution for both error-free and error-prone environments, this paper proposes two error concealment (EC) solutions that rely mainly on the available base layer information. The proposed EC solutions are integrated at the decoder side of the most recent scalable high efficiency video coding (SHVC) standard. They adapt to the coding structure of the SHVC standard, notably the quad-tree division and the high-level syntax approach. Experiments conducted on a rich set of test sequences and conditions show the advantages of the proposed EC solutions, notably with around 4 dB improvement in concealed frame quality compared to the conventional frame copy approach.
Authors: Van Xiem Hoang (xiemhoang@vnu.edu.vn); Trieu Duong Dinh (duongdt@vnu.edu.vn); Vu Huu Tien (tienvh@ptit.edu.vn); Nguyen Huu Thuc

A Novel Content Adaptive Search Strategy for Low Complexity Frame Rate Up Conversion
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2589
Deposited: 2017-10-29T06:59:30Z
Modified: 2017-11-06T02:27:51Z
Frame rate up-conversion (FRUC) is an important technique for various film/video conversions and display technologies, as it both improves the viewing experience and reduces the cost of video transmission.
However, with increasing video resolutions and the heavy computation associated with the motion estimation (ME) stage, FRUC is hardly suitable for real-time video applications. In this context, we propose a novel ME search strategy for a low-complexity yet effective FRUC framework. In the proposed FRUC framework, the search strategy, one of the major aspects that directly influence the FRUC processing time as well as the interpolated frame quality, is adaptively driven by the video content. Both temporal and spatial activities are considered to adjust the number of search points according to the minimum mean absolute difference (MAD) between current and reference blocks. Experimental results on a rich set of video sequences show the advantages of the proposed FRUC scheme, notably in both interpolated frame quality and processing time, compared to relevant benchmarks.
Authors: Van Xiem Hoang (xiemhoang@vnu.edu.vn); Phi Cong Huy (huypc@ptit.edu.vn)

HEVC based Distributed Scalable Video Coding for Surveillance Visual System
URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2588
Deposited: 2017-10-29T06:59:24Z
Modified: 2017-12-05T06:25:17Z
Surveillance visual systems play an important role in modern life, especially in the Internet of Things (IoT) era. However, limited bandwidth and energy resources and the heterogeneity of devices, networks, and environments call for a more powerful video coding solution that provides not only high compression efficiency but also flexible scalability. In this context, we propose a novel scalable video coding solution designed particularly for surveillance video content, which typically contains low motion and static scenes and thus has high temporal redundancy.
In the proposed video coding framework, the conventional video coding standard, i.e., High Efficiency Video Coding (HEVC), is combined with the emerging distributed coding paradigm, following a layered coding approach, to exploit the high temporal correlation between frames in surveillance video content. As assessed, the proposed surveillance distributed scalable video coding solution significantly outperforms the relevant coding benchmarks, notably with around 36.8% bitrate saving on average when compared to the HEVC simulcasting benchmark.
Van Xiem Hoang <xiemhoang@vnu.edu.vn>; Thi Hue Le Dao <hueledao94@gmail.com>; Trieu Duong Dinh <duongdt@vnu.edu.vn>

2017-10-29T03:19:37Z 2017-10-29T03:19:37Z
http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2590
Improving SHVC Performance with a Block based Joint Layer Prediction Solution
Considering the need for a more powerful scalable video coding solution beyond the recent Scalable High Efficiency Video Coding (SHVC) standard, this paper proposes a novel joint layer prediction creation solution. In the proposed solution, the temporal correlation between frames is exploited through a motion compensated temporal interpolation (MCTI) mechanism. The MCTI frame is then adaptively combined with the base layer reconstruction using a linear combination algorithm. In this combination, a weighting factor is defined and computed for each predicted block using the estimated errors associated with each input. Finally, to achieve the highest compression efficiency, the fused frame is treated as an additional reference and adaptively selected using a rate distortion optimization (RDO) mechanism.
Experiments conducted for a rich set of test conditions show that significant compression efficiency gains can be achieved with the proposed solution, notably up to 4.5% in enhancement layer BD-Rate savings relative to the standard SHVC quality scalable codec.
Van Xiem Hoang <xiemhoang@vnu.edu.vn>

2017-10-29T03:16:53Z 2017-12-17T08:24:10Z
http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2587
Joint Exploration Model based Light Field Image Coding: A Comparative Study
Light field imaging technology has recently been attracting a lot of interest due to its potential applications in a large number of areas, including Virtual Reality and Augmented Reality (VR/AR), teleconferencing, and e-learning. Light Field (LF) data can provide rich visual information, such as scene rendering with changes in depth of field, viewpoint, and focal length. However, LF data usually comes with a critical problem: its massive volume. Therefore, compressing LF data is one of the main challenges in LF research. In this context, we present a comparative study of compressing LF data with not only the widely used image/video coding standards, such as JPEG-2000, H.264/AVC, HEVC and Google/VP9, but also the most recent image/video coding solution, the Joint Exploration Model. In addition, this paper proposes an LF image coding flow, which can be used as a benchmark for future LF compression evaluation.
Finally, the compression efficiency of these coding solutions is thoroughly compared across a rich set of test conditions.
Cong Huy Phi <huypc@ptit.edu.vn>; Stuart Perry <Stuart.Perry@uts.edu.au>; Anh Vu Trinh <vuta@vnu.edu.vn>; Van Xiem Hoang <xiemhoang@vnu.edu.vn>

2017-10-29T03:16:22Z 2017-10-29T03:16:22Z
http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2586
A statistical search range adaptation solution for effective frame rate up - conversion
The recent development of advanced television systems has demonstrated a need for an efficient video conversion technique. In this scenario, frame rate up conversion (FRUC) solutions play an important role due to their benefits in both increasing the viewing quality experience and reducing the cost of video transmission. However, with the recent increase in video resolution, notably from Standard Definition (SD) to High Definition (HD) and ultra HD, FRUC now requires not only better interpolated frame quality but also lower processing time. Considering this problem, this paper proposes a novel statistical learning based adaptive search range solution to enable an effective FRUC mechanism. In the proposed adaptive search range solution, a set of spatial-temporal features is carefully defined and exploited to adaptively assign an appropriate search range value to each considered block, notably by formulating the search range adaptation as a classification problem and using the well-known support vector machine framework for the classification task.
Experimental results for a rich set of common video test sequences show the advantages of the proposed adaptive search range solution, notably in both interpolated frame quality improvement and processing time reduction.
Van Xiem Hoang <xiemhoang@vnu.edu.vn>

2017-10-29T03:11:24Z 2017-12-18T08:32:20Z
http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2570
An Online SVM based Side Information Creation for Efficient Distributed Scalable Video Coding
With the significant increase in network heterogeneity and the wide use of emerging video applications such as wireless sensor networks, video surveillance systems and remote sensing, Distributed Scalable Video Coding (DSVC) is a potential solution for efficiently transmitting and storing video data due to its high compression efficiency and low encoding complexity. In the DSVC framework, Side Information (SI), created at the decoder side by exploiting the temporal and inter-layer correlations between decoded frames, plays an important role as it directly affects the final DSVC coding performance. Therefore, this paper proposes a novel SI creation solution which explicitly formulates the SI creation as a classification problem and employs an online learning Support Vector Machine (SVM) engine to fuse several SI candidates.
Experiments conducted for a rich set of test sequences show that the proposed SI creation solution significantly outperforms the previous DSVC SI creation methods in terms of SI quality while only slightly increasing the computational complexity.
Van Xiem Hoang <xiemhoang@vnu.edu.vn>; Nguyen Thi Huong Thao <thaotb07@gmail.com>

2016-12-12T16:48:54Z 2016-12-12T16:48:54Z
http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2036
Side information creation using adaptive block size for distributed video coding
Thi Huong Thao Nguyen; Huu Tien Vu; Van San Vu; Van Xiem Hoang <xiemhoang@vnu.edu.vn>; Thanh Ha Le <ltha@vnu.edu.vn>; Trieu Duong Dinh <duongdt@vnu.edu.vn>

2016-12-12T03:59:27Z 2016-12-12T03:59:27Z
http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2035
Spatial - Temporal Feature Extraction based Adaptive Search Range for Effective Frame Rate Up - Conversion
Van Xiem Hoang <xiemhoang@vnu.edu.vn>; Duong Trieu Dinh <duongdt@vnu.edu.vn>; Thanh Ha Le <ltha@vnu.edu.vn>

2016-12-08T03:40:02Z 2016-12-08T03:40:02Z
http://eprints.uet.vnu.edu.vn/eprints/id/eprint/1992
Spatial-Temporal Feature Extraction based Adaptive Search Range for Effective Frame Rate-Up Conversion
Frame rate up conversion (FRUC) has been playing an important role in the recent development of advanced television systems because it both improves the viewing experience and reduces the cost of video transmission. However, with increasing video resolutions, notably from Standard Definition (SD) to High Definition (HD), FRUC is now expected to provide not only better interpolated frame quality but also lower processing time.
Therefore, in this paper, we propose a novel spatial-temporal feature extraction based adaptive search range for effective FRUC. In the proposed adaptive search range scheme, a set of temporal and spatial features is carefully defined and exploited to adaptively assign an appropriate search range value to each considered block, thus directly reducing the FRUC processing time. Moreover, since the optimal search range can be employed, the quality of interpolated frames is significantly improved. Experimental results for a rich set of video test sequences show the advantages of the proposed FRUC scheme, notably in both subjective and objective image quality improvement and processing time reduction.
Van Xiem Hoang <xiemhoang@vnu.edu.vn>; Trieu Duong Dinh <duongdt@vnu.edu.vn>; Thanh Ha Le <ltha@vnu.edu.vn>

2016-12-01T06:07:44Z 2016-12-01T06:08:00Z
http://eprints.uet.vnu.edu.vn/eprints/id/eprint/1998
Improving 3D-TV View Synthesis Using Motion Compensated Temporal Interpolation
The development of three-dimensional (3D) video applications such as three-dimensional television (3D-TV) and free-viewpoint television (FTV) has greatly enhanced the viewing experience. View synthesis methods such as depth-image-based rendering (DIBR) play a significant role in 3D content creation and 3D transmission, and have been integrated into video coding standards such as 3D-High Efficiency Video Coding (3D-HEVC). However, the current DIBR method employs only the disparity correlation between views to create a so-called synthesized view, and thus cannot take full advantage of the available information. In this paper, we propose a novel view synthesis method which takes advantage of not only the disparity correlation but also the temporal correlation between views.
In the proposed method, an effective motion compensation based frame interpolation is employed to generate a temporal prediction view, which is then combined with the DIBR rendered view to obtain the final synthesized view. Experimental results show that the proposed method significantly outperforms conventional techniques in terms of both peak signal-to-noise ratio (PSNR) and subjective visual quality.
Dinh Minh Le <minhld_57@vnu.edu.vn>; Tung Long Vuong; Van Xiem Hoang <xiemhoang@vnu.edu.vn>; Trieu Duong Dinh <duongdt@vnu.edu.vn>; Thanh Ha Le <ltha@vnu.edu.vn>

2016-05-28T03:57:33Z 2017-01-06T09:25:05Z
http://eprints.uet.vnu.edu.vn/eprints/id/eprint/1689
Improving SHVC Performance with a Joint Layer Coding Mode
The growing need for a powerful scalable video coding engine targeting the heterogeneous landscape of networks, devices, and consumption environments has led to the development of the Scalable High Efficiency Video Coding (SHVC) standard, an extension of the High Efficiency Video Coding (HEVC) standard. To improve the SHVC compression efficiency, this paper proposes a novel joint layer coding mode to be integrated in the SHVC codec. In the proposed coding mode, the base layer (BL) and enhancement layer (EL) decoded information are linearly combined at the pixel level to create an additional coding mode. To fuse the BL and EL driven predictions, a weighting term is defined to indicate the contribution of each to the final joint layer prediction. To reach high adaptability, these weights are computed at pixel level in the prediction unit. Moreover, to achieve the highest compression efficiency, the proposed joint layer coding mode is adaptively selected using a rate distortion optimization (RDO) mechanism.
Experiments conducted for a rich set of test conditions show that significant compression efficiency gains can be achieved with the proposed joint layer coding mode, notably up to 4.3% in BD-Rate savings relative to the standard SHVC quality scalable codec.
Van Xiem Hoang <xiemhoang@vnu.edu.vn>; Joao Ascenso <A.Joao@gmail.com>