VNU-UET Repository: No conditions. Results ordered -Date Deposited.

2024-07-27T16:40:13Z EPrints http://eprints.uet.vnu.edu.vn/images/sitelogo.png https://eprints.uet.vnu.edu.vn/eprints/ 2021-12-10T10:58:19Z 2021-12-10T10:58:19Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4661 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4661 2021-12-10T10:58:19Z MAI_ARM: An Intelligent Robot Arm Using Multimodal Artificial Intelligence

Robot arm systems play a very important role in modern industrial production. Applying robot arms in everyday life would help people greatly, for example by fully automating the distribution of medicine and food to COVID-19 patients. Using robot arms in daily life requires a way for users to interact with the robot easily. This paper develops a method for interacting with a robot arm using multimodal artificial intelligence that combines speech and vision. The paper also proposes a Chessboard Calibration method that improves the accuracy of determining the position at which the robot arm acts. A 3D-printed robot arm with four degrees of freedom (4-DOF) is used to run and evaluate the resulting multimodal AI model. A video recording of the system in operation is available at: https://youtu.be/RHgAyHXMH6I
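The abstract does not describe how Chessboard Calibration works internally. As a rough illustration of the general idea of mapping camera pixels to robot workspace coordinates from chessboard corners, the sketch below assumes a flat workspace and fits a planar homography with the direct linear transform. The function names `fit_homography` and `pixel_to_robot` are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_homography(pixel_pts, robot_pts):
    """Estimate a 3x3 homography H so that robot ~ H @ pixel (needs >= 4 points).

    Assumption: the chessboard corners and the robot workspace lie in one plane.
    """
    pixel_pts = np.asarray(pixel_pts, dtype=float)
    robot_pts = np.asarray(robot_pts, dtype=float)
    rows = []
    for (u, v), (x, y) in zip(pixel_pts, robot_pts):
        # Two linear constraints per correspondence (direct linear transform).
        rows.append([-u, -v, -1.0, 0.0, 0.0, 0.0, x * u, x * v, x])
        rows.append([0.0, 0.0, 0.0, -u, -v, -1.0, y * u, y * v, y])
    # The homography is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def pixel_to_robot(h, u, v):
    """Map one pixel coordinate to workspace coordinates using homography h."""
    x, y, w = h @ np.array([u, v, 1.0])
    return x / w, y / w
```

In use, the pixel positions of detected chessboard corners would be paired with their known physical positions on the board, and a detected object's pixel centre would then be converted with `pixel_to_robot` before commanding the arm.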

Bao Minh Dinh minhdinh@vnu.edu.vn Duc Son Tran transonhhc12c5@gmail.com The Huong Nguyen huongrbe@gmail.com Nam Do donam.2801@gmail.com Minh Hoang Le lhoang17062000@gmail.com Van Xiem Hoang xiemhoang@vnu.edu.vn 2021-12-10T10:58:14Z 2021-12-10T10:58:14Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4660 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4660 2021-12-10T10:58:14Z Evaluating and Optimizing the Hector SLAM Algorithm for Mapping and Localization on the Pimouse Robot

Mapping and localization are two of the four fundamental problems of mobile robot systems. The two common approaches to these problems today use a LiDAR system and/or an image sensor system, together with algorithms that process the acquired data. The approach based on LiDAR and the Hector SLAM algorithm produces highly accurate maps but requires the algorithm's parameters to be tuned. To understand this issue, we study and evaluate the main parameters that affect the runtime performance of Hector SLAM on a mobile robot system that uses LiDAR for mapping and localization. System performance is evaluated in two respects: (i) the quality of the resulting map and (ii) CPU usage. By understanding how the Hector SLAM parameters affect system performance, users can adjust these parameters flexibly to suit the robot in use. The results are demonstrated on the Pimouse Robot, a mobile robot system developed by RT Corporation, Japan.

Bao Minh Dinh minhdinh@vnu.edu.vn Anh Viet Dang vietda@vnu.edu.vn Canh Thanh Nguyen canhthanhlt@gmail.com Van Xiem Hoang xiemhoang@vnu.edu.vn 2021-12-10T10:58:11Z 2021-12-10T10:58:11Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4659 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4659 2021-12-10T10:58:11Z Fast QTMT for H.266/VVC Intra Prediction using Early-Terminated Hierarchical CNN model

Versatile Video Coding (VVC) was standardized in July 2020. Compared to the previous High Efficiency Video Coding (HEVC) standard, VVC saves up to 50% bitrate for equal perceptual video quality. To reach this efficiency, the Joint Video Experts Team (JVET) introduced a number of improved techniques into the VVC model. As a result, the complexity of VVC encoding has also increased greatly. One of the new techniques contributing to this growth in complexity is the quad-tree nested multi-type tree (QTMT), which includes binary and ternary splits and leads to blocks of various square and rectangular shapes. Based on this observation, we propose in this paper a new deep learning based fast QTMT method. We use a trained convolutional neural network (CNN) model, namely the Early-Terminated Hierarchical CNN, to predict the coding unit map, which is then fed into the VVC encoder to terminate the block partitioning process early. Experimental results show that the proposed method can save 30.29% encoding time with a negligible BD-Rate increase.

Van Xiem Hoang xiemhoang@vnu.edu.vn Quang Sang Nguyen ngsang998@gmail.com Bao Minh Dinh minhdinh@vnu.edu.vn Ngoc Minh Do ngocminhc2nc1@gmail.com Trieu Duong Dinh duongdt@vnu.edu.vn 2020-12-08T15:30:57Z 2020-12-08T15:30:57Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4203 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4203 2020-12-08T15:30:57Z Improving the TZ Search Algorithm to Accelerate H.266/Versatile Video Coding Encoding

Improving video coding standards has attracted considerable attention recently, driven by the growing demands of multimedia communication applications. The most recent video coding standard to date is H.266/VVC (Versatile Video Coding). Thanks to these improvement efforts, H.266/VVC achieves bitrate savings of up to 50% compared with the widely used H.265/HEVC (High Efficiency Video Coding) standard while keeping the decoded video quality unchanged. However, to reach this coding performance, H.266/VVC requires 5 to 30 times the encoding time of H.265/HEVC. The main cause is the need to search for matching blocks over a large search space and across more search cases. To address this problem, the paper proposes an improved version of the fast search algorithm TZ-Search (Test Zone Search) that speeds up encoding further when used in H.266/VVC. Evaluation results show that the improved TZ-Search reduces H.266/VVC encoding time by up to 12.6% compared with the original TZ-Search while still maintaining high coding performance.

Thanh Huong Bui Quang Sang Nguyen Trieu Duong Dinh Duc Trinh Chu Van Xiem Hoang 2020-07-18T02:47:19Z 2020-07-18T02:47:19Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4027 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4027 2020-07-18T02:47:19Z Improving performance of distributed video coding by consecutively refining of side information and correlation noise model

Distributed video coding (DVC) is built on distributed source coding (DSC) principles, where the video statistics are exploited, partly or fully, at the decoder instead of the encoder. In theory, DVC has been proved to incur no performance loss compared with predictive video coding. However, practical implementations still fall well short of this theoretical optimum. DVC coding efficiency depends mainly on creating the side information (SI), a noisy version of the original Wyner-Ziv frame (WZF) at the decoder, and on modeling the correlation noise, the difference between the original WZF and the corresponding SI. The performance of a DVC scheme improves when the SI and correlation noise are estimated as accurately as possible. This paper therefore proposes a method to enhance the quality of the SI and of the correlation noise model by using information in decoded WZFs during the decoding …

Tien Vu Huu tienvh@ptit.edu.vn thao Nguyen thi huong thaotb07@gmail.com Minh Nguyen ngoc Van Xiem Hoang xiemhoang@vnu.edu.vn 2020-07-18T02:46:31Z 2020-07-18T02:46:31Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4026 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4026 2020-07-18T02:46:31Z Adaptive Quantization Parameter Estimation for HEVC Based Surveillance Scalable Video Coding

Visual surveillance systems play a vital role in modern human life, with a large number of applications ranging from remote home management and public security to traffic monitoring. The recent High Efficiency Video Coding (HEVC) scalable extension, namely SHVC, provides not only compression efficiency but also adaptive streaming capability. However, SHVC was originally designed for videos captured from generic scenes rather than from visual surveillance systems. In this paper, we propose a novel HEVC based surveillance scalable video coding (SSVC) framework. First, to achieve high quality inter prediction, we propose a long-term reference coding method, which adaptively exploits the temporal correlation among frames in surveillance video. Second, to optimize the SSVC compression performance, we design a quantization parameter adaptation mechanism in which the relationship between SSVC rate-distortion (RD) performance and the quantization parameter is statistically modeled by a fourth-order polynomial function. An appropriate quantization parameter is then derived for frames at long-term reference positions. Experiments conducted on a common set of surveillance videos have shown that the proposed SSVC significantly outperforms the relevant SHVC standard, notably by around 6.9% and 12.6% bitrate saving for the low delay (LD) and random access (RA) coding configurations, respectively, while still providing a similar perceptual decoded frame quality.
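The fourth-order polynomial model of RD performance versus quantization parameter mentioned above can be illustrated with a small fitting sketch. It assumes that a handful of (QP, RD cost) measurements are available and that a lower cost is better. The function names and the selection rule in `best_qp` are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_rd_model(qps, rd_costs, order=4):
    """Least-squares fit of a polynomial mapping QP to a measured RD cost."""
    return np.polynomial.Polynomial.fit(qps, rd_costs, deg=order)


def best_qp(model, qp_min, qp_max):
    """Return the integer QP in [qp_min, qp_max] with the lowest modeled cost."""
    candidates = np.arange(qp_min, qp_max + 1)
    return int(candidates[np.argmin(model(candidates))])
```

With measurements taken at several QP values for the long-term reference frames, `best_qp(fit_rd_model(qps, costs), qp_min, qp_max)` would return the QP that the fitted curve predicts to be best within the allowed range.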

Van Xiem Hoang xiemhoang@vnu.edu.vn 2020-07-18T02:46:20Z 2020-07-18T02:46:20Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4025 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/4025 2020-07-18T02:46:20Z Improving TDWZ Correlation Noise Estimation: A Deep Learning based Approach

Transform domain Wyner-Ziv video coding (TDWZ) has shown its benefits in compressing video for applications with limited resources, such as visual surveillance systems, remote sensing and wireless sensor networks. In TDWZ, the correlation noise model (CNM) plays a vital role since it directly affects the number of bits that must be sent from the encoder and thus the overall TDWZ compression performance. To achieve a highly accurate CNM for TDWZ, we propose in this paper a novel CNM estimation approach in which the CNM with Laplacian distribution is adaptively estimated based on a deep learning (DL) mechanism. The proposed DL based CNM includes two hidden layers and a linear activation function to adaptively update the Laplacian parameter. Experimental results show that the proposed TDWZ codec significantly outperforms the relevant benchmarks, notably by around 35% bitrate saving when compared to the DISCOVER codec and around 22% bitrate saving when compared to the HEVC Intra benchmark, while providing a similar perceptual quality.
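For orientation, the Laplacian parameter that the paper's network updates has a simple classical estimate. A zero-mean Laplacian with density f(r) = (α/2)·exp(−α|r|) has variance 2/α², so α can be estimated from the residual variance. The sketch below shows only this moment-based baseline, not the paper's deep learning estimator. The residual between the side information and a reference frame is assumed to be available.

```python
import numpy as np


def laplacian_alpha(residual, eps=1e-12):
    """Moment-based estimate of the Laplacian parameter: alpha = sqrt(2 / variance).

    Assumption: the residual is zero-mean, so its variance is mean(residual**2).
    """
    residual = np.asarray(residual, dtype=float)
    variance = float(np.mean(np.square(residual)))
    return float(np.sqrt(2.0 / max(variance, eps)))


def laplacian_pdf(r, alpha):
    """Density of a zero-mean Laplacian with parameter alpha evaluated at r."""
    return 0.5 * alpha * np.exp(-alpha * np.abs(r))
```

A larger α means the side information is closer to the original frame, so fewer bits need to be requested from the encoder.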

Tien Vu Huu tienvh@ptit.edu.vn Nguyen Thi Huong Thao thaotb07@gmail.com San Vu van sanvv@ptit.edu.vn Van Xiem Hoang xiemhoang@vnu.edu.vn 2019-12-10T03:50:20Z 2019-12-10T03:50:20Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3785 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3785 2019-12-10T03:50:20Z Analysis and Evaluation of Video Coding Performance with the H.265/HEVC Standard

Today, alongside the emergence of high-quality video (with resolutions up to 8k×4k), demand is growing for video services that fit the transmission capacity of existing network infrastructure (whose throughput cannot be increased immediately) and of modern personal devices (such as mobile phones and tablets). Improving video coding (compression) standards is therefore an essential need that has received sustained development effort over the past few decades. The most modern video coding standard to date is H.265/HEVC (High Efficiency Video Coding), which brings many improvements in coding efficiency, integration with transport systems, data loss recovery, and support for parallel processing architectures. On that basis, this paper presents the overall architecture of the H.265/HEVC standard, analyzes its new and notable features compared with older video coding standards, and evaluates its performance through simulation on two criteria: compression performance and algorithmic complexity. The results will help promote the use of H.265/HEVC in practical applications in Vietnam and around the world.

Thanh Huong Bui huong1204@gmail.com Cong Huy Phi huypc@ptit.edu.vn Trieu Duong Dinh duongdt@vnu.edu.vn Van Xiem Hoang xiemhoang@vnu.edu.vn 2019-05-28T13:41:36Z 2019-05-28T13:41:36Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3457 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3457 2019-05-28T13:41:36Z Distortion Model based on Perceptual of Local Image Content

As humans are the ultimate receivers of the majority of visual signals being processed, the most accurate way of assessing image quality is to ask humans for their opinions of an image's quality, known as subjective visual quality assessment (VQA). The subjective image quality scores gathered from all subjects are processed into the mean opinion score (MOS), which is regarded as the ground truth of image quality. Because the human visual system (HVS) differs in its sensitivity to the features of an image patch, this paper proposes a novel coding distortion modelling method for local image perception. A database for image-patch quality assessment experiments has been developed. The mean opinion score is treated as an essential parameter, while the QP-MOS sigmoid curve is determined by the local image content.

Thanh Tung Pham tung@vinafire.com.vn Trieu Duong Dinh duongdt@vnu.edu.vn Van Xiem Hoang xiemhoang@vnu.edu.vn Tien Vu Huu tienvh@ptit.edu.vn Thanh Ha Le ltha@vnu.edu.vn 2019-05-28T13:41:16Z 2019-05-28T13:41:16Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3458 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3458 2019-05-28T13:41:16Z DISTRIBUTED VIDEO ENCODING AND DECODING METHOD

The invention proposes a distributed video encoding and decoding method comprising an encoding process and a decoding process. The distributed video encoding method is first performed at the encoder side (the WZ encoder) with the steps: partitioning the video sequence, encoding the KEY frames, and encoding the WZ frames. It is then performed at the decoder side (the WZ decoder) with the steps: decoding the KEY frames and decoding the WZ frames. The KEY frames are encoded and decoded using the JEM (Joint Exploration Model) encoder/decoder.

Van Xiem Hoang xiemhoang@vnu.edu.vn Trieu Duong Dinh duongdt@vnu.edu.vn 2019-05-27T08:30:46Z 2019-05-27T08:30:46Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3450 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3450 2019-05-27T08:30:46Z A Frame Loss Concealment Solution for Spatial Scalable HEVC using Base Layer Motion

Scalable High Efficiency Video Coding (SHVC) is the most recent video coding solution designed mainly for network-adaptive or device-adaptive applications. SHVC follows a layered coding structure with one base layer (BL) and one or several enhancement layers (ELs), which can be unequally protected. However, SHVC is often sensitive to packet loss in unreliable networks, especially in the ELs. In this paper, we propose a novel error concealment method for the SHVC EL under the assumption that the BL is well protected. First, we recover the partitioning and resample the motion data from the collocated BL frame. Next, we remove outliers from the motion field with a motion vector refinement algorithm. Lastly, we conceal the lost frame using motion compensation and a deblocking filter. Experiments conducted with a rich set of test sequences for the spatial-scalable SHVC standard have shown that our proposed method significantly outperforms the relevant error concealment methods, e.g., BL Reconstruction Up-sampling (RU) and BL-SKIP, in both subjective and objective quality assessments.

Huu Thuc Nguyen thuckechsu02@gmail.com Canh Thuong Nguyen Ngcthuong@gmail.com Van Xiem Hoang xiemhoang@vnu.edu.vn Jeon Byeungwoo bjeon@skku.edu 2019-05-27T08:30:27Z 2019-05-27T08:30:27Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3452 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3452 2019-05-27T08:30:27Z A LOW COMPLEXITY WYNER-ZIV CODING SOLUTION FOR LIGHT FIELD IMAGE TRANSMISSION AND STORAGE

Compressing Light Field (LF) imaging data is a challenging but very important task for both LF image transmission and storage applications. In this paper, we propose a novel coding solution for LF images using the well-known Wyner-Ziv information theorem. First, the LF image is decomposed into the 4D LF data format. Using a spiral scanning procedure, a pseudo-sequence of LF 4D images is generated. This sequence is then compressed in a distributed coding manner as specified in the Wyner-Ziv theorem. In this context, low computational complexity can be achieved at the encoder since the computationally expensive motion estimation part is shifted to the decoder. In addition, we introduce a novel adaptive frame skipping algorithm to further exploit the high correlation between 4D LF images. Experimental results show that the proposed WZ coding based LF image coding solution achieves a significant compression gain, notably around 54% bitrate saving when compared with the standard High Efficiency Video Coding (HEVC) Intra benchmark, while requiring less computational complexity.

Cong Huy Phi huypc@ptit.edu.vn Perry Stuart Stuart.Perry@uts.edu.au Van Xiem Hoang xiemhoang@vnu.edu.vn 2019-05-27T08:29:52Z 2019-06-03T04:11:31Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3454 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3454 2019-05-27T08:29:52Z A Novel Fusion Method for 3D-TV View Synthesis Using Temporal and Disparity Correlations

View synthesis, such as depth-image-based rendering (DIBR), plays a significant role in 3D content creation for 3D-TV. However, perceptual errors introduced by current view synthesis often result in severe distortions in synthesized images. In this paper, we propose a novel view synthesis fusion (VSF) method which adaptively exploits temporal and disparity correlations to improve the quality of the synthesized picture. The proposed VSF method defines a robust correlation assessment metric for fusing several pre-created virtual view candidates. Unlike conventional methods, the proposed fusion algorithm is applied to both hole and non-hole areas. Experimental results show that the proposed method significantly outperforms conventional methods in both peak signal-to-noise ratio (PSNR) and subjective visual quality.

Trieu Duong Dinh duongdt@vnu.edu.vn Minh Le Dinh minhle2994@gmail.com Jeon Byeungwoo bjeon@skku.edu Van Xiem Hoang xiemhoang@vnu.edu.vn 2019-05-27T08:29:21Z 2019-06-03T04:11:07Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3455 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3455 2019-05-27T08:29:21Z A Novel Consistent Quality Driven for JEM based Distributed Video Coding

Distributed video coding (DVC) is an attractive and promising solution for low-complexity constrained video applications, such as wireless sensor networks or wireless surveillance systems. In DVC, visual quality consistency is one of the most important criteria for evaluating the performance of a DVC codec. However, the quality of the decoded frames produced by most recent DVC codecs is not consistent and fluctuates strongly. To solve this problem, we propose in this paper a novel DVC solution named JEM based DVC (JEM-DVC), which provides not only higher performance than traditional DVC solutions but also an effective scheme for quality consistency control. In the proposed JEM-DVC solution, we first employ several advanced techniques provided in the Joint Exploration Model (JEM) of the future video coding standard (FVC) to effectively improve the performance of the JEM-DVC codec. Then, for consistent quality control, we propose two novel methods named key frame quantization (KF-Q) and Wyner-Ziv frame quantization (WZF-Q), which determine the optimal values of the quantization parameter (QP) and quantization matrix (QM) applied to key and WZ frame coding, respectively. Unlike conventional approaches, the optimal values of QP and QM are adaptively controlled and updated for every key and WZ frame to guarantee consistent video quality for the proposed codec. Our proposed JEM-DVC is the first DVC codec in the literature to employ the JEM coding technique, so all results presented in this paper are new. Experimental results show that the proposed JEM-DVC significantly outperforms the relevant DVC benchmarks, notably the DISCOVER DVC and the recent H.265/HEVC based DVC, in terms of both peak signal-to-noise ratio (PSNR) performance and consistent visual quality.

Trieu Duong Dinh duongdt@vnu.edu.vn Cong Huy Phi huypc@ptit.edu.vn Van Xiem Hoang xiemhoang@vnu.edu.vn 2019-05-27T08:28:49Z 2019-05-27T08:28:49Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3456 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3456 2019-05-27T08:28:49Z Cooperative Caching in Two-Layer Hierarchical Cache aided Systems

Caching has received much attention as a promising technique for meeting the high data rate and stringent latency requirements of future wireless networks. The premise of caching is to prefetch the most popular contents closer to end users, in the local cache of edge nodes such as a base station (BS). When a user requests content that is available in the cache, it can be served directly without being sent from the core network. In this paper, we investigate the performance of hierarchical caching systems, in which both the BS and end users are equipped with storage memory. In particular, we propose a novel cooperative caching scheme that jointly optimizes the content placement at the BS's and users' caches. The proposed caching scheme is analytically shown to achieve a larger global caching gain than the reference in both uncoded and coded caching strategies. Finally, numerical results are presented to demonstrate the effectiveness of our proposed caching algorithm.
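The abstract does not give the placement rule, so the following is only a toy illustration of why coordinating the two cache layers can help. It assumes a single user, Zipf-distributed content popularity, and uncoded caching. Under those assumptions it compares a BS that duplicates the user's cached files with a BS that stores the next most popular files instead. All names and the Zipf assumption are hypothetical and are not taken from the paper.

```python
import numpy as np


def zipf_popularity(n_files, exponent=0.8):
    """Request probability of each file under an assumed Zipf popularity law."""
    ranks = np.arange(1, n_files + 1, dtype=float)
    weights = ranks ** (-exponent)
    return weights / weights.sum()


def edge_hit_ratio(popularity, user_slots, bs_slots, cooperative):
    """Probability that a request is served by the user cache or the BS cache."""
    order = np.argsort(popularity)[::-1]  # most popular file first
    user_set = set(order[:user_slots].tolist())
    if cooperative:
        # Toy cooperative rule: the BS skips files the user already holds.
        bs_set = set(order[user_slots:user_slots + bs_slots].tolist())
    else:
        # Independent rule: each layer caches the globally most popular files.
        bs_set = set(order[:bs_slots].tolist())
    cached = sorted(user_set | bs_set)
    return float(popularity[cached].sum())
```

In this toy model the cooperative placement covers more distinct files with the same total storage, which is the intuition behind a larger global caching gain. The paper's actual scheme and its analysis are more general.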

Van Xiem Hoang xiemhoang@vnu.edu.vn thi Hang duong hangdt@haui.edu.vn Anh Vu Trinh vuta@vnu.edu.vn Xuan Thang Vu thang.vu85@gmail.com 2018-12-20T06:04:47Z 2018-12-20T06:04:47Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3289 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3289 2018-12-20T06:04:47Z View Synthesis Method for 3D Video Coding Based on Temporal and Inter View Tung Long Vuong Dinh Minh Le minhld_57@vnu.edu.vn Van Xiem Hoang xiemhoang@vnu.edu.vn Trieu Duong Dinh duongdt@vnu.edu.vn Huu Tien Vu Thanh Ha Le ltha@vnu.edu.vn 2018-12-18T02:57:41Z 2018-12-18T02:57:41Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3336 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3336 2018-12-18T02:57:41Z Complexity Controlled Side Information Creation for Distributed Scalable Video Coding

Distributed scalable video coding (DSVC) has recently been gaining much attention due to its benefits in terms of computational complexity, error resilience and scalability, which are important for emerging video applications like wireless sensor networks and visual surveillance systems (VSS). In DSVC, side information (SI) creation plays a key role as it directly affects the DSVC compression performance and the encoder/decoder computational complexity. However, in many VSS applications, the energy of each VSS node usually diminishes over time, which makes it difficult to transmit surveillance video in real time. To address this problem, we propose a complexity controlled SI creation solution for the new DSVC framework. To achieve flexible SI creation, the complexity associated with the SI creation process is modeled using a linear model in which the model parameters are estimated from a fitting process. To adjust the SI complexity, a user parameter is defined based on the availability of the VSS energy resource. Experiments conducted on a rich set of video surveillance data have revealed the benefits of the proposed complexity control solution, notably in both complexity control and compression performance.

Quang Hoang Van quanghvdt@gmail.com Le Dao Thi Hue hueledao94@gmail.com Vien Du dinh dudinhvien@gmail.com Vu Nguyen Hong vu.nguyenhong@gmail.com Van Xiem Hoang xiemhoang@vnu.edu.vn 2018-12-17T02:52:03Z 2018-12-17T02:52:03Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3290 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3290 2018-12-17T02:52:03Z Coding distortion modelling method for local image perception Thanh Tung Pham Trieu Duong Dinh duongdt@vnu.edu.vn Van Xiem Hoang xiemhoang@vnu.edu.vn Huu Tien Vu Thanh Ha Le ltha@vnu.edu.vn 2018-12-14T01:42:08Z 2018-12-18T07:02:04Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3249 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3249 2018-12-14T01:42:08Z Cooperative Caching in Two-Layer Hierarchical Cache-aided Systems

Van Xiem Hoang xiemhoang@vnu.edu.vn Hang Duong Anh Vu Trinh vuta@vnu.edu.vn Xuan Thang Vu thang.vu85@gmail.com 2018-11-20T08:58:20Z 2018-11-21T09:07:21Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3088 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3088 2018-11-20T08:58:20Z Joint Layer Prediction for Improving SHVC Compression Performance and Error Concealment

Scalable High Efficiency Video Coding (SHVC) standard is expected to play a more important role in the heterogeneous landscape of broadcasting, multimedia, networks, and various services applications as it is specified as a layered coding technique in the ATSC (Advanced Television Systems Committee) 3.0. However, its block-based structure of temporal and spatial prediction makes it sensitive to information loss and error propagation due to transmission errors. In this context, we propose an improved SHVC with a joint layer prediction (JLP) solution which adaptively combines the decoded information from the base and the enhancement layers to create an additional reference for the SHVC enhancement encoder. To optimize the quality of the joint prediction, the minimum mean square error (MMSE) estimation is executed in computing a combination factor which gives weights to each contribution of the decoded information from the layers. In addition, the proposed JLP is integrated into the SHVC decoder to work as an error concealment solution to mitigate the error propagation happening inevitably in practical video transmission. Experiments have shown that the proposed SHVC framework significantly outperforms its relevant benchmarks, notably by up to 14.8% in bitrate reduction with respect to the standard SHVC codec. The proposed SHVC error concealment strategy also greatly improves the concealed picture quality as well as reducing the problem of error propagation when compared to conventional error concealment approaches.
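The combination factor described above can be illustrated with the closed-form least-squares weight for blending two predictions. The sketch assumes the original block is available when the weight is computed, as it is at the encoder. The abstract does not say how the factor is obtained at the decoder, so that part is not covered. The function names are hypothetical.

```python
import numpy as np


def mmse_weight(target, base_pred, enh_pred, eps=1e-12):
    """Weight w minimizing ||target - (w * enh_pred + (1 - w) * base_pred)||^2."""
    base = np.asarray(base_pred, dtype=float).ravel()
    diff = np.asarray(enh_pred, dtype=float).ravel() - base
    resid = np.asarray(target, dtype=float).ravel() - base
    denom = float(diff @ diff)
    if denom < eps:
        # Both predictions are identical, so any weight gives the same result.
        return 0.5
    # Clip so the result stays a convex combination of the two predictions.
    return float(np.clip((resid @ diff) / denom, 0.0, 1.0))


def joint_prediction(base_pred, enh_pred, w):
    """Blend the base-layer and enhancement-layer predictions with weight w."""
    base = np.asarray(base_pred, dtype=float)
    enh = np.asarray(enh_pred, dtype=float)
    return w * enh + (1.0 - w) * base
```

The blended block returned by `joint_prediction` would then serve as the additional reference that the abstract describes.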

Van Xiem Hoang xiemhoang@vnu.edu.vn Jeon Byeungwoo bjeon@skku.edu 2018-10-09T09:47:25Z 2018-10-09T09:47:25Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3086 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/3086 2018-10-09T09:47:25Z Artificial Intelligence Based Adaptive GOP Size Selection for Effective Wyner-Ziv Video Coding

Wyner-Ziv video coding (WZVC) has been gaining much attention in recent decades due to its low computational complexity and error resilience benefits, notably when compared to traditional video coding standards such as H.264/AVC or High Efficiency Video Coding (HEVC). In a Wyner-Ziv video coding scheme, the compression efficiency can be controlled by the length of the group of pictures (GOP), which typically consists of two key frames and several WZ frames. However, current Wyner-Ziv video coding solutions usually employ a fixed GOP size or simple adaptive GOP size mechanisms that depend on heuristic features extracted from the video content. To address the limitations of current GOP size adaptation solutions, we propose in this paper a novel Artificial Intelligence based GOP size adaptation mechanism and integrate it into the most advanced transform domain Wyner-Ziv video coding (TDWZ) architecture. In the proposed GOP size adaptation mechanism, the proper GOP size is learnt from the correlation between video features and the optimal compression performance. Machine learning techniques are used to select the most suitable video features and to model the correlation between GOP size and compression performance. Experimental results show that, using the obtained GOP size adaptation mechanism, the TDWZ codec achieves improved compression performance compared to relevant benchmarks.

Thi Huong Thao Nguyen Cong Huy Phi huypc@ptit.edu.vn Huu Tien Vu Van Xiem Hoang xiemhoang@vnu.edu.vn 2017-10-29T06:59:34Z 2017-10-29T06:59:34Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2591 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2591 2017-10-29T06:59:34Z Base Layer Constrained Error Concealment Solutions for Robust SHVC Video Transmission

Aiming at a powerful scalable video coding solution for both error-free and error-prone environments, this paper proposes two error concealment (EC) solutions that rely mainly on the information available in the base layer. The proposed error concealment solutions are integrated at the decoder side of the most recent scalable high efficiency video coding (SHVC) standard. The proposed EC solutions adapt to the coding structure of the SHVC standard, notably the quad-tree division and the high-level syntax approach. Experiments conducted on a rich set of test sequences and conditions have shown the advantages of the proposed EC solutions, notably an improvement of around 4 dB in concealed frame quality when compared to the conventional frame copy approach.

Van Xiem Hoang xiemhoang@vnu.edu.vn Trieu Duong Dinh duongdt@vnu.edu.vn Vu Huu Tien tienvh@ptit.edu.vn Nguyen Huu Thuc 2017-10-29T06:59:30Z 2017-11-06T02:27:51Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2589 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2589 2017-10-29T06:59:30Z A Novel Content Adaptive Search Strategy for Low Complexity Frame Rate Up Conversion

Frame rate up-conversion (FRUC) is an important technique for various film/video conversions and display technologies because it both improves the viewing experience and reduces the cost of video transmission. However, with increasing video resolutions and the heavy computation associated with the motion estimation (ME) stage, FRUC is hardly suitable for real-time video applications. In this context, we propose a novel ME search strategy for a low complexity yet effective FRUC framework. In the proposed FRUC framework, the search strategy, one of the major aspects that directly influences the FRUC processing time as well as the interpolated frame quality, is adaptively driven by the video content. Both temporal and spatial activities are considered to adjust the number of search points according to the minimum mean absolute difference (MAD) between current and reference blocks. Experimental results on a rich set of video sequences show the advantages of the proposed FRUC scheme, notably in both interpolated frame quality and processing time, when compared to relevant benchmarks.
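As a rough illustration of MAD-driven block matching with a content-dependent search range, the sketch below runs an exhaustive search inside a radius. It chooses that radius from the zero-motion MAD. The threshold, the two radii and the function names are hypothetical. The paper's actual rule, which also uses spatial activity, is not given in the abstract.

```python
import numpy as np


def mad(block_a, block_b):
    """Mean absolute difference between two equally sized blocks."""
    return float(np.mean(np.abs(block_a.astype(float) - block_b.astype(float))))


def choose_radius(zero_motion_mad, threshold=2.0, small=1, large=4):
    """Hypothetical rule: search fewer points when the block is nearly static."""
    return small if zero_motion_mad < threshold else large


def search_motion(cur, ref, top, left, size, radius):
    """Full search within +/- radius; returns ((dy, dx), cost) with minimum MAD."""
    block = cur[top:top + size, left:left + size]
    best_mv, best_cost = (0, 0), float("inf")
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > ref.shape[0] or x + size > ref.shape[1]:
                continue
            cost = mad(block, ref[y:y + size, x:x + size])
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost
```

A caller would first compute the MAD at zero displacement, pass it to `choose_radius`, and then call `search_motion` with the resulting radius, so that static regions cost only a few comparisons.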

Van Xiem Hoang xiemhoang@vnu.edu.vn Phi Cong Huy huypc@ptit.edu.vn 2017-10-29T06:59:24Z 2017-12-05T06:25:17Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2588 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2588 2017-10-29T06:59:24Z HEVC based Distributed Scalable Video Coding for Surveillance Visual System

Surveillance visual systems play an important role in modern life, especially in the Internet of Things (IoT) era. However, limited bandwidth and energy resources, together with the heterogeneity of devices, networks and environments, call for a more powerful video coding solution that provides not only high compression efficiency but also flexible scalability. In this context, we propose a novel scalable video coding solution, particularly designed for surveillance video content, which typically contains low motion and static scenes and thus has high temporal redundancy. In the proposed video coding framework, the conventional video coding standard, i.e., High Efficiency Video Coding (HEVC), is combined with the emerging distributed coding paradigm, following a layered coding approach, to exploit the high temporal correlation between frames in surveillance video content. As assessed, the proposed surveillance distributed scalable video coding solution significantly outperforms the relevant coding benchmarks, notably with around 36.8% bitrate saving on average when compared to the HEVC simulcasting benchmark.

Van Xiem Hoang xiemhoang@vnu.edu.vn Thi Hue Le Dao hueledao94@gmail.com Trieu Duong Dinh duongdt@vnu.edu.vn 2017-10-29T03:19:37Z 2017-10-29T03:19:37Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2590 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2590 2017-10-29T03:19:37Z Improving SHVC Performance with a Block based Joint Layer Prediction Solution

Considering the need for a more powerful scalable video coding solution beyond the recent Scalable High Efficiency Video Coding (SHVC) standard, this paper proposes a novel joint layer prediction creation solution. In the proposed solution, the temporal correlation between frames is exploited through a motion compensated temporal interpolation (MCTI) mechanism. The MCTI frame is then adaptively combined with the base layer reconstruction using a linear combination algorithm. In this combination, a weighting factor is defined and computed for each predicted block using the estimated errors associated with each input. Finally, to achieve the highest compression efficiency, the fused frame is treated as an additional reference and adaptively selected using a rate distortion optimization (RDO) mechanism. Experiments conducted for a rich set of test conditions show that significant compression efficiency gains can be achieved with the proposed solution, notably up to 4.5% in enhancement layer BD-Rate savings with respect to the standard SHVC quality scalable codec.
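
One plausible reading of the block-wise linear combination described above is an inverse-error weighting, sketched below. The weight formula, the function names and the sample values are assumptions for illustration only; the abstract states that a weighting factor is computed per block from the estimated errors of each input but does not give the formula.

```python
def fusion_weight(mcti_error, bl_error):
    """Weight given to the MCTI block; the base layer block gets 1 - weight.

    Assumed rule: the input with the smaller estimated error is trusted more.
    """
    total = mcti_error + bl_error
    if total == 0:
        return 0.5  # both inputs look equally reliable
    return bl_error / total


def fuse_block(mcti_block, bl_block, mcti_error, bl_error):
    """Linear combination of two co-located blocks, rounded to integer samples."""
    w = fusion_weight(mcti_error, bl_error)
    return [
        [round(w * m + (1 - w) * b) for m, b in zip(mcti_row, bl_row)]
        for mcti_row, bl_row in zip(mcti_block, bl_block)
    ]


# Hypothetical 2x2 blocks and error estimates.
mcti_block = [[100, 104], [96, 100]]
bl_block = [[108, 112], [104, 108]]

fused = fuse_block(mcti_block, bl_block, mcti_error=1.0, bl_error=3.0)
print(fused)
```

Here the MCTI input has the smaller estimated error, so it receives a weight of 0.75 and the fused samples lie closer to the MCTI block than to the base layer block.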

Van Xiem Hoang xiemhoang@vnu.edu.vn 2017-10-29T03:16:53Z 2017-12-17T08:24:10Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2587 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2587 2017-10-29T03:16:53Z Joint Exploration Model based Light Field Image Coding: A Comparative Study

Light field imaging technology has recently attracted considerable interest due to its potential applications in a large number of areas, including Virtual Reality and Augmented Reality (VR/AR), teleconferencing, and e-learning. Light Field (LF) data is able to provide rich visual information, such as scene rendering with changes in depth of field, viewpoint, and focal length. However, LF data usually comes with a critical problem: its massive data volume. Therefore, compressing LF data is one of the main challenges in LF research. In this context, we present a comparative study of LF data compression using not only the widely used image/video coding standards, such as JPEG-2000, H.264/AVC, HEVC and Google/VP9, but also the most recent image/video coding solution, the Joint Exploration Model. In addition, this paper proposes an LF image coding flow, which can be used as a benchmark for future LF compression evaluation. Finally, the compression efficiency of these coding solutions is thoroughly compared over a rich set of test conditions.

Cong Huy Phi huypc@ptit.edu.vn Stuart Perry Stuart.Perry@uts.edu.au Anh Vu Trinh vuta@vnu.edu.vn Van Xiem Hoang xiemhoang@vnu.edu.vn 2017-10-29T03:16:22Z 2017-10-29T03:16:22Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2586 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2586 2017-10-29T03:16:22Z A statistical search range adaptation solution for effective frame rate up - conversion

The recent development of advanced television systems has demonstrated a need for an efficient video conversion technique. In this scenario, frame rate up conversion (FRUC) solutions play an important role due to their benefits in both improving the viewing experience and reducing the cost of video transmission. However, with the recent increase in video resolution, notably from Standard Definition (SD) to High Definition (HD) and ultra HD, FRUC now requires not only better interpolated frame quality but also lower processing time. Considering this problem, this paper proposes a novel statistical learning based adaptive search range solution to enable an effective FRUC mechanism. In the proposed adaptive search range solution, a set of spatial-temporal features is carefully defined and exploited to adaptively assign an appropriate search range value to each considered block, notably by formulating the search range adaptation as a classification problem and using the well-known support vector machine framework for the classification task. Experimental results for a rich set of common video test sequences show the advantages of the proposed adaptive search range solution, notably in both interpolated frame quality improvement and processing time reduction.
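
Treating search range selection as classification can be sketched at inference time as follows: each candidate range is a class, and a trained linear classifier (for instance a linear support vector machine in a one-vs-rest arrangement) scores the block's features. The candidate ranges, the two features and all weights below are hypothetical placeholders, not values from the paper, and no training step is shown.

```python
# Hypothetical candidate search ranges (in pixels), one per class.
SEARCH_RANGES = [4, 8, 16]

# Hypothetical one-vs-rest linear models: (weights, bias) for each class.
# In practice these would come from training on labelled blocks.
MODELS = [
    ((-1.0, -0.5), 1.0),   # scores high for low-activity blocks
    ((0.2, 0.1), 0.0),     # scores high for moderate activity
    ((1.0, 0.5), -1.5),    # scores high for high-activity blocks
]


def decision_scores(features):
    """Linear decision value w . x + b for every class."""
    return [
        sum(w * f for w, f in zip(weights, features)) + bias
        for weights, bias in MODELS
    ]


def predict_search_range(features):
    """Pick the search range whose classifier gives the largest score."""
    scores = decision_scores(features)
    best_class = max(range(len(scores)), key=scores.__getitem__)
    return SEARCH_RANGES[best_class]


# Features are assumed to be (temporal activity, spatial activity) per block.
calm_block = (0.1, 0.1)
moderate_block = (1.0, 1.0)
busy_block = (3.0, 2.0)

print(predict_search_range(calm_block),
      predict_search_range(moderate_block),
      predict_search_range(busy_block))
```

With these placeholder weights, blocks with little activity get the smallest range and highly active blocks the largest, which is the behaviour that saves motion estimation time on easy content.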

Van Xiem Hoang xiemhoang@vnu.edu.vn 2017-10-29T03:11:24Z 2017-12-18T08:32:20Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2570 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2570 2017-10-29T03:11:24Z An Online SVM based Side Information Creation for Efficient Distributed Scalable Video Coding

With the significant increase in network heterogeneity and the wide use of emerging video applications such as wireless sensor networks, video surveillance systems and remote sensing, Distributed Scalable Video Coding (DSVC) is a potential solution for efficiently transmitting and storing video data due to its high compression efficiency and low encoding complexity. In the DSVC framework, Side Information (SI), created at the decoder side by exploiting the temporal and inter-layer correlations between decoded frames, plays an important role as it directly affects the final DSVC coding performance. Therefore, this paper proposes a novel SI creation solution that explicitly formulates SI creation as a classification problem and employs an online learning Support Vector Machine (SVM) engine to fuse several SI candidates. Experiments conducted for a rich set of test sequences show that the proposed SI creation solution significantly outperforms previous DSVC SI creation methods in terms of SI quality while only slightly increasing the computational complexity.

Van Xiem Hoang xiemhoang@vnu.edu.vn Nguyen Thi Huong Thao thaotb07@gmail.com 2016-12-12T16:48:54Z 2016-12-12T16:48:54Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2036 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2036 2016-12-12T16:48:54Z Side information creation using adaptive block size for distributed video coding Thi Huong Thao Nguyen Huu Tien Vu Van San Vu Van Xiem Hoang xiemhoang@vnu.edu.vn Thanh Ha Le ltha@vnu.edu.vn Trieu Duong Dinh duongdt@vnu.edu.vn 2016-12-12T03:59:27Z 2016-12-12T03:59:27Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2035 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/2035 2016-12-12T03:59:27Z Spatial - Temporal Feature Extraction based Adaptive Search Range for Effective Frame Rate Up - Conversion Van Xiem Hoang xiemhoang@vnu.edu.vn Duong Trieu Dinh duongdt@vnu.edu.vn Thanh Ha Le ltha@vnu.edu.vn 2016-12-08T03:40:02Z 2016-12-08T03:40:02Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/1992 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/1992 2016-12-08T03:40:02Z Spatial-Temporal Feature Extraction based Adaptive Search Range for Effective Frame Rate-Up Conversion

Frame rate up conversion (FRUC) has been playing an important role in the recent development of advanced television systems due to its benefits in both improving the viewing experience and reducing the cost of video transmission. However, with increasing video resolutions, notably from Standard Definition (SD) to High Definition (HD), FRUC is now required to provide not only better interpolated frame quality but also lower processing time. Therefore, in this paper, we propose a novel spatial-temporal feature extraction based adaptive search range for effective FRUC. In the proposed adaptive search range scheme, a set of temporal and spatial features is carefully defined and exploited to adaptively assign an appropriate search range value to each considered block, thus directly reducing the FRUC processing time. Moreover, since the optimal search range can be employed, the quality of interpolated frames is significantly improved. Experimental results for a rich set of video test sequences show the advantages of the proposed FRUC scheme, notably in both subjective and objective image quality improvement and processing time reduction.

Van Xiem Hoang xiemhoang@vnu.edu.vn Trieu Duong Dinh duongdt@vnu.edu.vn Thanh Ha Le ltha@vnu.edu.vn 2016-12-01T06:07:44Z 2016-12-01T06:08:00Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/1998 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/1998 2016-12-01T06:07:44Z Improving 3D-TV View Synthesis Using Motion Compensated Temporal Interpolation

The development of three-dimensional (3D) video applications such as three-dimensional television (3D-TV) and free-viewpoint television (FTV) has greatly enriched the viewing experience. View synthesis methods such as depth-image-based rendering (DIBR) play a significant role in 3D content creation and 3D transmission, and have been integrated into video coding standards such as 3D-High Efficiency Video Coding (3D-HEVC). However, the current DIBR method employs only the disparity correlation between views to create a so-called synthesized view and is thus unable to take full advantage of the available information. In this paper, we propose a novel view synthesis method that takes advantage of not only the disparity correlation but also the temporal correlation between views. In the proposed method, an effective motion compensation based frame interpolation is employed to generate a temporal prediction view, which is then combined with the DIBR rendered view to obtain the final synthesized view. Experimental results show that the proposed method significantly outperforms conventional techniques in terms of both peak signal-to-noise ratio (PSNR) and subjective visual quality.

Dinh Minh Le minhld_57@vnu.edu.vn Tung Long Vuong Van Xiem Hoang xiemhoang@vnu.edu.vn Trieu Duong Dinh duongdt@vnu.edu.vn Thanh Ha Le ltha@vnu.edu.vn 2016-05-28T03:57:33Z 2017-01-06T09:25:05Z http://eprints.uet.vnu.edu.vn/eprints/id/eprint/1689 This item is in the repository with the URL: http://eprints.uet.vnu.edu.vn/eprints/id/eprint/1689 2016-05-28T03:57:33Z Improving SHVC Performance with a Joint Layer Coding Mode

The growing need for a powerful scalable video coding engine targeting the heterogeneous landscape of networks, devices, and consumption environments has led to the development of the Scalable High Efficiency Video Coding (SHVC) standard, an extension of the High Efficiency Video Coding (HEVC) standard. To improve the SHVC compression efficiency, this paper proposes a novel joint layer coding mode to be integrated in the SHVC codec. In the proposed coding mode, the base layer (BL) and enhancement layer (EL) decoded information are linearly combined at the pixel level to create an additional coding mode. To fuse the BL and EL driven predictions, a weighting term is defined to indicate the contribution of each to the final joint layer prediction. To reach high adaptability, these weights are computed at pixel level in the prediction unit. Moreover, to achieve the highest compression efficiency, the proposed joint layer coding mode is adaptively selected using a rate distortion optimization (RDO) mechanism. Experiments conducted for a rich set of test conditions show that significant compression efficiency gains can be achieved with the proposed joint layer coding mode, notably up to 4.3% in BD-Rate savings with respect to the standard SHVC quality scalable codec.

Van Xiem Hoang xiemhoang@vnu.edu.vn Joao Ascenso A.Joao@gmail.com