Simultaneous camera and hand-eye calibration for eye-on-hand systems


SIMULTANEOUS CAMERA AND HAND-EYE CALIBRATION FOR EYE-ON-HAND SYSTEMS

Nguyen Huu Hung (1), Nguyen Quang Thi (1), Tran Cong Manh (1)

Abstract

Determining the location and orientation of objects in the robot workspace is a fundamental function of manufacturing automation. The problem is solved with a robot vision system in which a camera is mounted on the robot end-effector with a known hand-eye transformation. In this setting, a practical solution must deal with the full complexity of calibration: the internal and external parameters of the camera as well as the hand-eye parameters. To this end, this paper presents a simple and efficient calibration method for a camera-on-hand system in which the internal and external parameters of a 2D camera and the hand-eye parameters are calibrated simultaneously. The method is based on 3D-to-2D projections of a calibration block observed under a minimum of two pure translations and two pure rotations of the robot hand. Evaluations on simulated data and on a real robot vision system indicate that the method works stably under different noise levels and numbers of stations.

Index terms: Camera Calibration, Hand-Eye Calibration.

1. Introduction

A robot vision system consists of one or more cameras and one or more robots and is used in industry for applications such as bin picking, modeling [1], robotic grasping [2], and medical procedures [3]. The measurement accuracy of such a system depends on each of its components: the camera parameters, the hand-eye parameters, and the repeatability of the robot hand. Usually, calibration is performed in separate stages, camera calibration first and hand-eye calibration afterwards; in particular, the images used for camera calibration are not reused in the hand-eye calibration step.

Camera calibration is a necessary step in 3D computer vision for recovering the unknown parameters of the camera model. It is performed by observing a calibration object whose 3D geometry is known with high precision. The calibration object usually contains one, two, or three mutually perpendicular planes, and much work has been done on this topic, for example [4], [5], [6]. Approaches using a single plane, the so-called planar chessboard, usually require an expensive apparatus and an elaborate setup.

(1) Institute of System Integration, Le Quy Don Technical University
Usually, the hand-eye calibration problem is formulated as solving homogeneous transformation equations of the form AX = XB [7] (*), where X is the 4×4 homogeneous transformation from the robot hand coordinate frame to the sensor coordinate frame, and A and B are the measurable 4×4 homogeneous transformations of the robot hand and of the camera between a first and a second position. Several closed-form solutions have been proposed for X, for example [7] and [8]. The unknown hand-eye transformation can also be estimated by solving AX = ZC, where A is the known homogeneous transformation obtained from hand pose measurements, C is computed from the calibrated internal-link forward kinematics of the manipulator, X is the unknown transformation from the robot hand frame to the sensor frame, and Z is the unknown transformation from the world frame to the robot-base frame. This problem has been solved in [9] and [10]. In these formulations, hand-eye calibration is independent of camera calibration.

Hand-eye calibration by teaching pendant, in which an operator repeatedly moves the robot hand through a sequence of locations, has also been used for several decades. Teaching the robot to touch chessboard corners is known to be dangerous for operators and time consuming; any incorrect operation can cause severe injury to people near the robot. Note that, for the above-mentioned approaches, obtaining the unknown transformation X accurately requires solving a non-linear system built from multiple hand motions and captured images; additionally, the camera must be calibrated beforehand in a separate procedure. Recently, simultaneous hand-eye calibration has been performed using chessboard corners observed at multiple hand positions [11]. This methodology still requires many hand motions and therefore shares the drawbacks of the teaching-pendant method.

Fig. 1. The proposed simultaneous hand-eye calibration.

In this paper, we focus on reducing the calibration effort by introducing an approach that simultaneously calibrates the camera and the hand-eye transformation using a 3D calibration block and a minimum of two pure hand translations and two pure hand rotations, i.e., four hand motions in total. First, the camera is calibrated at each robot station by taking advantage of the 3D calibration block, which contains two orthogonal planes.
Second, the hand-eye rotation is estimated from at least two pure hand translations, and the hand-eye translation is then obtained from at least two pure hand rotations together with the pre-estimated hand-eye rotation. Fig. 1 summarizes the step-by-step procedure of the proposed approach.

The remainder of this paper is organized as follows. Section 2 summarizes the background on the camera model and traditional hand-eye calibration. Section 3 describes camera calibration by estimating the projection matrix and decomposing it into internal and external parameters. Section 4 presents hand-eye calibration based on pure hand translations and rotations. Section 5 reports experimental results on simulated and real data.

2. Preliminaries

2.1. Camera model

In a pinhole camera system, the relation between a 3D world point $^W P = [x, y, z, 1]^T$ and a 2D point in the image plane is given by the full perspective camera model [12]:

$$
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix} f_u & s & c_u \\ 0 & f_v & c_v \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} R_{11} & R_{12} & R_{13} & t_x \\ R_{21} & R_{22} & R_{23} & t_y \\ R_{31} & R_{32} & R_{33} & t_z \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\quad (1)
$$

This combines a homogeneous transformation, which maps the 3D world point into camera coordinates, with a projection from camera coordinates to image coordinates in pixels. Equation (1) can be rewritten compactly as

$$ {}^C p = {}^C K \, [\,{}^C_W R \;\; {}^C_W t\,] \, {}^W P \quad (2) $$

or further compacted as

$$ {}^C p = {}^C_W M \, {}^W P \quad (3) $$

where the 3×3 matrix $^C K$ is the intrinsic (camera) matrix containing the internal parameters of the camera, and the 3×3 rotation matrix $^C_W R$ together with the 3×1 translation vector $^C_W t$ forms the external parameters, representing the transformation from the world coordinate frame to the camera coordinate frame. The 3×4 matrix $^C_W M$ is called the projection matrix. The pair of a 2D point $^C p$ and a 3D point $^W P$ is called a 2D-3D correspondence, and the calibration process estimates the intrinsic matrix $^C K$ from such correspondences.
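To make the model of equations (1)-(3) concrete, the following minimal Python sketch (assuming NumPy; all numeric values are illustrative placeholders, not our calibrated parameters) projects a 3D world point into pixel coordinates:

```python
import numpy as np

# Illustrative intrinsic matrix K (eq. 1): focal lengths fu, fv, zero skew,
# principal point (cu, cv). Placeholder values only.
K = np.array([[2544.0,    0.0, 640.0],
              [   0.0, 2554.0, 480.0],
              [   0.0,    0.0,   1.0]])

# Illustrative world-to-camera pose [R | t] (the external parameters).
R = np.eye(3)                          # no rotation, for simplicity
t = np.array([[0.0], [0.0], [800.0]])  # camera 800 mm from the world origin

M = K @ np.hstack([R, t])              # 3x4 projection matrix (eq. 3)

P_w = np.array([20.0, 40.0, 0.0, 1.0]) # homogeneous 3D world point
p = M @ P_w
u, v = p[0] / p[2], p[1] / p[2]        # perspective division
print(u, v)
```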
Fig. 2. Camera calibration process using the calibration block: (left) captured image, (middle) 7×7 line detection, (right) 7×7 line intersections with sub-pixel refinement.

2.2. Hand-eye calibration

In a vision system with the camera mounted on the hand, the well-known hand-eye calibration equation is AX = XB, where A is the hand motion, B is the camera motion, and X is the transformation from the camera to the end-effector that must be estimated. From equation (*), the following two constraints must be satisfied:

$$ R_A R_X = R_X R_B \quad (4) $$

$$ (I - R_A)\, t_X = t_A - R_X t_B \quad (5) $$

Several approaches have been proposed for estimating R_X from equation (4), for instance using the rotation axis and angle [13], quaternions [14], and a canonical matrix representation [9]. The translation is then estimated by solving a pseudo-inverse problem, and the result can be refined by nonlinear optimization [8]. In all of these approaches, the hand motions include both translation and rotation components.

3. Camera Calibration

3.1. Calibration block and corner detection

The calibration block contains two orthogonal 8×8 planar chessboards with 20 mm square size; each surface therefore has 7×7 = 49 inner corners, for 98 corners in total. To detect these corners, we first apply the Hough line transform to extract lines, then determine the initial corner locations as line intersections, and finally refine these locations to sub-pixel accuracy. These main steps are summarized in Fig. 2.
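A sketch of this corner-detection pipeline, assuming OpenCV and NumPy, could look as follows; the helper `intersect`, the bundle-splitting heuristic, and all thresholds are our own illustrative choices rather than the exact values used in the paper:

```python
import cv2
import numpy as np

def intersect(l1, l2):
    """Intersection of two lines given in Hough normal form (rho, theta)."""
    a = np.array([[np.cos(l1[1]), np.sin(l1[1])],
                  [np.cos(l2[1]), np.sin(l2[1])]])
    b = np.array([l1[0], l2[0]])
    return np.linalg.solve(a, b)  # raises LinAlgError if lines are parallel

img = cv2.imread("calibration_block.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(img, 50, 150)

# Extract the chessboard grid lines with the Hough transform.
lines = cv2.HoughLines(edges, 1, np.pi / 180, threshold=200)[:, 0, :]

# Split into two near-orthogonal bundles and intersect them to get
# the initial corner estimates (per-face selection logic simplified).
horiz = [l for l in lines if abs(np.sin(l[1])) > 0.7]
vert  = [l for l in lines if abs(np.sin(l[1])) <= 0.7]
corners = np.array([intersect(h, v) for h in horiz for v in vert],
                   dtype=np.float32).reshape(-1, 1, 2)

# Sub-pixel refinement around each initial intersection.
term = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.01)
cv2.cornerSubPix(img, corners, (5, 5), (-1, -1), term)
```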
3.2. Camera calibration

At each robot station the camera is calibrated: the 3×4 projection matrix is estimated by solving a system of linear equations and is then decomposed into intrinsic and extrinsic parameters. Specifically, equation (3) can be written in detail, in the same form as equation (1), as

$$
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix} f_u & s & c_u \\ 0 & f_v & c_v \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} R_{11} & R_{12} & R_{13} & t_x \\ R_{21} & R_{22} & R_{23} & t_y \\ R_{31} & R_{32} & R_{33} & t_z \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\quad (6)
$$

so that each 2D-3D correspondence gives

$$ u_i = \frac{m_{11} x_i + m_{12} y_i + m_{13} z_i + m_{14}}{m_{31} x_i + m_{32} y_i + m_{33} z_i + m_{34}} \quad (7) $$

$$ v_i = \frac{m_{21} x_i + m_{22} y_i + m_{23} z_i + m_{24}}{m_{31} x_i + m_{32} y_i + m_{33} z_i + m_{34}} \quad (8) $$

Equations (7) and (8) are written compactly as

$$
\begin{bmatrix} {}^W P_i^T & 0 & -u_i \, {}^W P_i^T \\ 0 & {}^W P_i^T & -v_i \, {}^W P_i^T \end{bmatrix} m = 0 \quad (9)
$$

where $m = [m_{11}, m_{12}, m_{13}, m_{14}, m_{21}, m_{22}, m_{23}, m_{24}, m_{31}, m_{32}, m_{33}, m_{34}]^T$. To find the elements of the matrix M, the linear system Am = 0 is solved. A simple approach is to take the eigenvector of the smallest eigenvalue, i.e., to minimize the objective

$$ \min \| A \bar m \| \quad (10) $$

subject to the constraint $\|\bar m\| = 1$. The solution $\bar m$ is therefore determined only up to this normalization; however, the 3×3 submatrix

$$
{}^C K \, {}^C_W R = \begin{bmatrix} \bar m_{11} & \bar m_{12} & \bar m_{13} \\ \bar m_{21} & \bar m_{22} & \bar m_{23} \\ \bar m_{31} & \bar m_{32} & \bar m_{33} \end{bmatrix} \quad (11)
$$

must satisfy $\bar m_{31}^2 + \bar m_{32}^2 + \bar m_{33}^2 = 1$, because its last row equals the last row of the rotation matrix $^C_W R$, which is a unit vector. For that reason, we rescale

$$
{}^C K \, {}^C_W R = \frac{1}{\sqrt{\bar m_{31}^2 + \bar m_{32}^2 + \bar m_{33}^2}}
\begin{bmatrix} \bar m_{11} & \bar m_{12} & \bar m_{13} \\ \bar m_{21} & \bar m_{22} & \bar m_{23} \\ \bar m_{31} & \bar m_{32} & \bar m_{33} \end{bmatrix} \quad (12)
$$

The intrinsic parameters and the rotation matrix from world to camera are then obtained by RQ decomposition. Finally, the translation from world to camera is calculated by solving

$$ {}^C K \, {}^C_W t = \begin{bmatrix} m_{14} \\ m_{24} \\ m_{34} \end{bmatrix} \quad (13) $$
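A compact NumPy sketch of the linear calibration of equations (9)-(13) is given below, under our stated assumptions (at least six well-spread, non-coplanar 2D-3D correspondences; `scipy.linalg.rq` provides the triangular-orthogonal factorization):

```python
import numpy as np
from scipy.linalg import rq  # RQ factorization: A = R @ Q, R upper-triangular

def calibrate_dlt(points_3d, points_2d):
    """Estimate K, R, t from n >= 6 matched 3D-2D points (eqs. 9-13).

    points_3d: (n, 3) world points; points_2d: (n, 2) pixel points.
    A sketch of the paper's linear method; no lens distortion is modeled.
    """
    rows = []
    for (x, y, z), (u, v) in zip(points_3d, points_2d):
        P = [x, y, z, 1.0]
        rows.append(P + [0.0] * 4 + [-u * c for c in P])  # eq. (7) -> row of (9)
        rows.append([0.0] * 4 + P + [-v * c for c in P])  # eq. (8) -> row of (9)
    A = np.asarray(rows)

    # Minimum of ||A m|| with ||m|| = 1 (eq. 10): the right singular
    # vector associated with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    M = Vt[-1].reshape(3, 4)

    # Fix the scale so the last row of K R has unit norm (eq. 12).
    M = M / np.linalg.norm(M[2, :3])
    if M[2, 3] < 0:                   # keep the scene in front of the camera
        M = -M

    K, R = rq(M[:, :3])               # decompose K R into intrinsics, rotation
    S = np.diag(np.sign(np.diag(K)))  # force positive focal lengths
    K, R = K @ S, S @ R
    t = np.linalg.solve(K, M[:, 3])   # eq. (13)
    return K / K[2, 2], R, t
```

The sign fixes are the usual bookkeeping for this decomposition: the recovered M is defined only up to sign, and the RQ factorization is unique only once the diagonal of K is forced positive.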
Fig. 3. Robot vision system with a camera mounted on the hand.

4. Hand-Eye Calibration

In this section we propose a simplified hand-eye calibration that uses at least two pure hand translations to estimate the hand-eye rotation, followed by at least two pure hand rotations to obtain the hand-eye translation once the rotation is known. Fig. 3 shows two positions of the robot vision system; each hand position is called a station. In other approaches the hand motions include both translation and rotation components, whereas the proposed approach exploits two special kinds of motion: pure translation and pure rotation.

The hand can be moved under operator control, and the hand motion is obtained from the measured hand poses as

$$ {}^{H_1}_{H_2}T = {}^{H_1}_{B}T \, \big({}^{H_2}_{B}T\big)^{-1} \quad (14) $$

Since the camera is calibrated with respect to the world coordinate frame, the camera motion is obtained just as easily:

$$ {}^{C_1}_{C_2}T = {}^{C_1}_{W}T \, \big({}^{C_2}_{W}T\big)^{-1} \quad (15) $$

The hand motion and the camera motion share the constraint AX = XB, which can be rewritten as

$$ {}^{H_1}_{C_2}T = {}^{H_1}_{H_2}T \, {}^{H_2}_{C_2}T = {}^{H_1}_{C_1}T \, {}^{C_1}_{C_2}T \quad (16) $$

or in detail as

$$
\begin{bmatrix} {}^{H_1}_{H_2}R & {}^{H_1}_{H_2}t \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} {}^{H_2}_{C_2}R & {}^{H_2}_{C_2}t \\ 0 & 1 \end{bmatrix}
=
\begin{bmatrix} {}^{H_1}_{C_1}R & {}^{H_1}_{C_1}t \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} {}^{C_1}_{C_2}R & {}^{C_1}_{C_2}t \\ 0 & 1 \end{bmatrix}
\quad (17)
$$

The rotation constraint is

$$ {}^{H_1}_{H_2}R \, {}^{H_2}_{C_2}R = {}^{H_1}_{C_1}R \, {}^{C_1}_{C_2}R \quad (18) $$

$$ {}^{H}R \, {}^{H}_{C}R = {}^{H}_{C}R \, {}^{C}R \quad (19) $$

where the hand-eye transformation is constant, $^{H}_{C}T = {}^{H_1}_{C_1}T = {}^{H_2}_{C_2}T$, and $^{H}R$ and $^{C}R$ denote the hand and camera rotation motions. The translation constraint is

$$ {}^{H_1}_{H_2}t + {}^{H_1}_{H_2}R \, {}^{H_2}_{C_2}t = {}^{H_1}_{C_1}t + {}^{H_1}_{C_1}R \, {}^{C_1}_{C_2}t \quad (20) $$
Finally, the translation component is related to the hand and camera motions by

$$ {}^{H}t + {}^{H}R \, {}^{H}_{C}t = {}^{H}_{C}t + {}^{H}_{C}R \, {}^{C}t \quad (21) $$

In the case of a pure translation motion there is no change in rotation, i.e., $^{H}R = I$, and equation (21) reduces to $^{H}t = {}^{H}_{C}R \, {}^{C}t$ (**). Many authors have noted that at least three pairs of $(^{H}t, {}^{C}t)$ are necessary to uniquely determine $^{H}_{C}R$. However, $^{H}_{C}R$ is a rotation matrix and therefore preserves cross products, so two hand motions and the two corresponding camera motions satisfy

$$ {}^{H}t_1 \times {}^{H}t_2 = {}^{H}_{C}R \left({}^{C}t_1 \times {}^{C}t_2\right) \quad (22) $$

For that reason, at least two pairs of hand and camera motions suffice to determine the hand-eye rotation. Assume there are N such pairs; let the matrix Λ collect the hand translations and the matrix Ψ the camera translations. From equation (22), the relationship between Λ and Ψ is written in closed form as

$$ \Lambda_{3 \times N} = {}^{H}_{C}R \, \Psi_{3 \times N} \quad (23) $$

In the case N = 2, a third column is appended to Λ and Ψ, equal to the cross product of the two hand motions and of the two camera motions, respectively. When the number of stations is larger than 2, the rotation $^{H}_{C}R$ is easily obtained using the SVD:

$$ [U, S, V] = \mathrm{SVD}\big(\Psi_{3 \times N} \, \Lambda_{3 \times N}^{T}\big) \quad (24) $$

$$ {}^{H}_{C}R = V U^{T} \quad (25) $$

With the hand-eye rotation estimated, it remains to estimate the hand-eye translation. With $^{H}_{C}R$ known, equation (21) can be rewritten as

$$ {}^{H}t - {}^{H}_{C}R \, {}^{C}t = \big(I - {}^{H}R\big) \, {}^{H}_{C}t \quad (26) $$

Each hand rotation contributes three linear equations through (26), but the matrix $(I - {}^{H}R)$ has rank 2, so at least two pure-rotation hand motions are needed to estimate the translation $^{H}_{C}t$.
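A sketch of the two estimation steps of this section, assuming NumPy (the function and variable names are ours; `hand_ts`/`cam_ts` in the first function are the paired pure-translation motions of equation (23), while `hand_Rs` with the corresponding translations in the second come from the pure-rotation motions of equation (26)):

```python
import numpy as np

def hand_eye_rotation(hand_ts, cam_ts):
    """Estimate the hand-eye rotation from N >= 2 pure hand translations.

    hand_ts, cam_ts: (N, 3) paired hand and camera translations (eq. 23).
    """
    L = np.asarray(hand_ts, dtype=float).T   # Lambda, 3 x N
    P = np.asarray(cam_ts, dtype=float).T    # Psi,    3 x N
    if L.shape[1] == 2:                      # N = 2: append cross products (eq. 22)
        L = np.column_stack([L, np.cross(L[:, 0], L[:, 1])])
        P = np.column_stack([P, np.cross(P[:, 0], P[:, 1])])
    U, _, Vt = np.linalg.svd(P @ L.T)        # eq. (24)
    R = Vt.T @ U.T                           # eq. (25)
    if np.linalg.det(R) < 0:                 # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R

def hand_eye_translation(R_hc, hand_Rs, hand_ts, cam_ts):
    """Estimate the hand-eye translation from M >= 2 pure hand rotations.

    hand_Rs, hand_ts: rotation and translation parts of the pure-rotation
    hand motions; cam_ts: translations of the corresponding camera motions.
    Stacks eq. (26): hand_t - R_hc cam_t = (I - hand_R) t.
    """
    A = np.vstack([np.eye(3) - Rh for Rh in hand_Rs])
    b = np.hstack([th - R_hc @ tc for th, tc in zip(hand_ts, cam_ts)])
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t
```

For N = 2 the appended cross-product column supplies the third independent direction, exactly as equation (22) describes; for N > 2 the SVD step is the standard orthogonal Procrustes solution.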
5. Experimental Results

To verify the proposed method, we experiment on both simulated data and real data captured with a robot vision system consisting of a Computar camera (12 mm focal length, 1280×960 resolution) mounted on a Schunk LWA3 robot. The simulated data contain 20 pairs of hand-eye motions: 10 pure translations and 10 pure rotations. The hand motions are limited to 100 mm in each translation direction and 20 degrees in each rotation component, and the noise levels for rotation and translation are set to 0.05 degree and 0.05 mm, respectively. To verify the approach in a real situation, we randomly choose a small number of motions from a dataset of 60 pure translations and 50 pure rotations captured with the above-mentioned system.

Fig. 4. Configuration of our robot vision system.

The configuration of our robot vision system is shown in Fig. 4. First, we consider the hand error. Two errors are associated with the hand: physical error and repeatability error. The repeatability error can be estimated roughly by moving the hand from a fixed position to different positions. The arm controller provides the joint values and the hand pose in the base coordinate frame. By commanding the hand from its default position to a chosen position 60 times and measuring the standard deviation of the hand-to-base transformation $^B_H T$, the repeatability was found to be around 0.2 mm in translation and 0.05 degree in rotation, as summarized in Table 1.

Table 1. Hand repeatability statistics

       Rx (deg)  Ry (deg)  Rz (deg)  Tx (mm)  Ty (mm)  Tz (mm)
B-H    0.047     0.02      0.037     0.14     0.1      0.12

The physical error, however, is not easy to measure directly. To estimate it roughly, we evaluate the hand-eye system by comparing the relative translations of pure hand translations with the corresponding camera motions. In the ideal case the relative motions of hand and camera are identical; due to mechanical error the two motions differ by a very small amount, and this difference indicates the error of the hand-eye system. To this end, we moved the hand 60 times in pure translation, measured the hand and camera motions, and built the histogram of the differences shown in Fig. 5. The mean and standard deviation of the difference are 0.94 ± 0.65 mm, indicating that the physical error of the robot hand is around 0.9 mm.

The system was evaluated for accuracy and precision of both camera calibration and hand-eye calibration. To make the evaluation clear, we first introduce the methodology used to evaluate each of them.
Fig. 5. Histogram of the difference between hand and camera relative translations.

5.1. Evaluation Methodology

This section describes how the system was evaluated and the metrics used to measure performance. For camera calibration, we measured the precision of the intrinsic and extrinsic parameters as their distribution over 20 calibrations at the same position. For hand-eye calibration, performance was evaluated on both simulated and real data.

Accuracy evaluation requires error metrics, and we adopt the rotation and translation metrics proposed in [7]. Assume there are two measurements of the same transformation, $^{H_0}_{H_1}\tilde T$ and $^{H_0}_{H_1}\hat T$, where $^{H_0}_{H_1}\tilde T$ is computed from two robot hand poses and $^{H_0}_{H_1}\hat T$ is estimated from the camera motions. The residual rotation is $R_e = {}^{H_0}_{H_1}\tilde R \; {}^{H_1}_{H_0}\hat R$, and the rotation error is expressed as

$$ O_{rot} = \pm \arccos\!\left(\frac{\mathrm{trace}(R_e) - 1}{2}\right) \quad (27) $$

The metric for the translation error is

$$ O_{transl} = \frac{\big\| {}^{H_0}_{H_1}\tilde t - {}^{H_0}_{H_1}\hat t \big\| + \big\| {}^{H_1}_{H_0}\tilde t - {}^{H_1}_{H_0}\hat t \big\|}{2} \quad (28) $$

From multiple pairs of hand-camera motions, the error is measured as the variance

$$ \sigma^2_{rot} = \frac{1}{N}\sum_{i=1}^{N} O^2_{rot} \quad (29) $$

$$ \sigma^2_{transl} = \frac{1}{N}\sum_{i=1}^{N} O^2_{transl} \quad (30) $$

The precision of our system is evaluated on both simulated and real data. For simulated data, a ground-truth hand-eye transformation is available and is used directly for evaluation. For real data there is no ground truth, so we measure the distribution of the estimates instead.
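The metrics (27)-(30) translate directly into a few lines of NumPy; this sketch uses our own function names, with inputs given as paired 4×4 homogeneous transforms:

```python
import numpy as np

def rotation_error(T_tilde, T_hat):
    """O_rot of eq. (27): angle of the residual rotation Re, in degrees."""
    Re = T_tilde[:3, :3] @ T_hat[:3, :3].T   # residual rotation
    c = np.clip((np.trace(Re) - 1.0) / 2.0, -1.0, 1.0)
    return np.degrees(np.arccos(c))

def translation_error(T_tilde, T_hat):
    """O_transl of eq. (28): mean norm over forward and inverse transforms."""
    d_fwd = np.linalg.norm(T_tilde[:3, 3] - T_hat[:3, 3])
    inv_tilde, inv_hat = np.linalg.inv(T_tilde), np.linalg.inv(T_hat)
    d_inv = np.linalg.norm(inv_tilde[:3, 3] - inv_hat[:3, 3])
    return 0.5 * (d_fwd + d_inv)

def error_variances(pairs):
    """Eqs. (29)-(30): variances of both metrics over N motion pairs."""
    rot = [rotation_error(Tt, Th) ** 2 for Tt, Th in pairs]
    tra = [translation_error(Tt, Th) ** 2 for Tt, Th in pairs]
    return np.mean(rot), np.mean(tra)
```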
5.2. Camera Calibration Evaluation

At a fixed hand position, we captured calibration-block images 20 times, performed calibration, and measured the distribution of the intrinsic and extrinsic parameters. In addition, we measured the distribution of the re-projection error, i.e., the distance between the observed and projected points. The results are summarized in Table 2.

Table 2. The accuracy of camera intrinsic parameters (pixels)

Focal length fu       2544.05 ± 2.07
Focal length fv       2554.3 ± 2.14
Skew                  -4.7 ± 0.13
Principal point cu    599.5 ± 1.9
Principal point cv    500.3 ± 1.6
Re-projection error   0.26 ± 0.04

Each row of Table 2 gives the mean and the spread. The first two rows indicate that the spread of the intrinsic parameters is less than about 2 pixels. Finally, we measured the distribution of the extrinsic parameters from the calibration block to the camera, $^C_W T$.

Table 3. The accuracy of camera extrinsic parameters

       Rx (deg)       Ry (deg)       Rz (deg)       Tx (mm)       Ty (mm)        Tz (mm)
C-W    103.7 ± 0.07   -50.9 ± 0.03   168.3 ± 0.07   27.5 ± 0.61   -44.7 ± 0.53   810.4 ± 0.71

Additionally, we compare our approach with the method of [4], which uses a planar chessboard instead of our calibration block, by measuring re-projection errors. For the method of [4] we measured the re-projection error for different numbers of images, whereas our method uses only one image. The results in Fig. 6 show that the re-projection error of the chessboard method decreases gradually and converges at around 10-20 images. With only one image, the re-projection error of our method is about 0.26 pixel, equivalent to that of the chessboard method at around 6-7 images.

Fig. 6. Re-projection error versus number of images for two methods: 1) the conventional method using multiple planar chessboard images; 2) the proposed method using the 3D calibration block.
5.3. Hand-Eye Calibration Evaluation

The hand-eye calibration results were compared with those of other methods on simulated and real data. First, we compared the proposed algorithm with the following traditional methods: 1) dual quaternions with the Kronecker product, proposed in [10]; 2) the quaternion formulation proposed by Dornaika [8]; 3) the Kronecker product method proposed by Shah [15]; and 4) solving homogeneous transformation equations [16]. These methods address simultaneous hand-eye/robot-world calibration in the general case, where the camera/hand motions are chosen randomly, whereas our proposed method exploits pure rotations and pure translations to simplify the hand-eye calibration process. For that reason, the popular hand-eye datasets with random hand motions are not suitable for evaluating our method. To compare fairly with the four methods above, the simulated and real datasets were generated with only pure-rotation and pure-translation motions. Note that the inputs of our method are hand motions and camera motions, while the other methods take camera motions and the transformation from the end-effector to the robot base; the hand motions are easily converted to that form by composing them with the transformation of a fixed end-effector position in the robot-base coordinate frame. As mentioned in [17], the methods of [10] and [15] show good performance.

We measured the rotation and translation errors and the difference between the estimated values and the ground truth. Fig. 7 shows the results on simulated data. The horizontal axis is the number of hand motions, where each robot hand position is called a station. For our method, the numbers of pure translations and pure rotations are kept similar: for a total of N_sim motions, there are N_sim/2 + 1 pure translations and N_sim/2 pure rotations. This allocation is used in all the experiments.

Fig. 7. Comparison of the rotation (left) and translation (right) errors of the proposed algorithm and popular approaches on simulated data.

Compared with the other methods, the proposed approach has a stable rotation error even when the number of stations is small. The errors of the other methods decrease as the number of stations increases; however, the translation error of the proposed method is the smallest at almost every station count. Fig. 8 shows that the 6-DOF estimate of the proposed method stays closest to the ground truth, while the other methods are more sensitive to noise.
Fig. 8. Comparison of the 6-DOF estimates of the proposed algorithm and the conventional approaches against the ground truth on simulated data.

Fig. 9. Comparison of the rotation (left) and translation (right) errors of the proposed algorithm and popular approaches in the real environment.

For the real data, we follow two protocols to evaluate the proposed method: 1) compare its accuracy with the other methods on a small dataset, and 2) evaluate its accuracy and precision in detail on a large dataset containing 60 pure translations and 50 pure rotations. To compare the accuracy of the proposed hand-eye calibration with the other methods, we used a real dataset of 11 pure translations and 10 pure rotations and evaluated the rotation and translation error metrics described in (29) and (30). The horizontal axis is again the number of stations: if N_real is the number of stations, there are N_real/2 + 1 pure translations and N_real/2 pure rotations. Fig. 9 shows the comparison results, which indicate that our method has a higher rotation error than the others but a smaller translation error. In particular, the proposed method provides stable results even when the number of stations is small. This small error is obtained thanks to the use of pure-rotation and pure-translation hand motions.
Note that in the hand-eye calibration process the time spent controlling the robot is far longer than the algorithm running time. Instead of measuring running time, we therefore use the number of camera/hand motions to evaluate the efficiency of each method. The translation and rotation errors in Fig. 7 and Fig. 9 indicate that our errors with 4-5 stations are equivalent to the other methods' errors at 8-12 stations, meaning that the proposed method is more efficient.

Fig. 10. Precision of our system: the distribution of the rotation and translation components.

To evaluate the accuracy and precision of our system in detail, we captured a dataset containing 60 pure translations and 50 pure rotations. A set of nT pure translations and nR pure rotations was then selected at random, and the selection was repeated 100 times while we measured the distribution of the estimated parameters. The precision of our hand-eye system with nT = 8 and nR = 8 is shown as an example in Fig. 10, and Table 4 gives the standard deviation of the hand-eye transformation for different numbers of pure translations nT and pure rotations nR. Additionally, we measured accuracy via the distribution of the rotation and translation errors for different numbers of translations and rotations. Table 4 summarizes these statistics over the 100 repetitions; visualized in Fig. 10, they indicate that the error of our system is around 0.8 degree in rotation and 4 mm in translation. This accuracy is sufficient for several applications such as object recognition, pose estimation, and point cloud registration. Both the translation and rotation errors decrease as the number of stations increases; however, more stations take more time for arm control and calibration. For that reason, the required accuracy must be traded off against the number of hand motions.
Table 4. Accuracy of our system: standard deviation of the hand-eye transformation (rotations in degrees, translations in mm)

nT  nR  Total  Rx    Ry    Rz    Tx    Ty     Tz
2   2   4      0.86  0.48  0.92  9.60  12.10  10.71
3   2   5      0.85  0.47  0.89  9.27  11.78  10.31
3   3   6      0.86  0.45  1.00  6.94  8.11   7.01
4   3   7      0.83  0.41  0.90  8.13  7.26   6.71
4   4   8      0.78  0.39  0.79  7.30  6.66   6.20
5   4   9      0.72  0.39  0.69  8.12  6.93   5.36
5   5   10     0.68  0.41  0.85  6.94  6.39   5.34
6   5   11     0.58  0.36  0.64  6.48  6.01   5.13
6   6   12     0.64  0.34  0.76  6.25  5.54   4.86
7   6   13     0.68  0.35  0.68  5.41  5.72   4.81
7   7   14     0.61  0.33  0.58  5.13  5.78   4.68
8   7   15     0.51  0.27  0.60  4.89  4.96   3.66
8   8   16     0.53  0.30  0.59  5.53  3.97   3.94
9   8   17     0.47  0.28  0.55  5.43  4.43   3.55
9   9   18     0.48  0.24  0.54  4.58  4.56   4.29
10  9   19     0.37  0.20  0.42  4.76  4.68   3.94
11  10  20     0.35  0.23  0.40  4.66  4.36   3.52
11  10  21     0.39  0.22  0.39  4.21  3.94   3.17

Fig. 11. The rotation (left) and translation (right) errors versus the number of stations.

6. Conclusions

We proposed a calibration solution for a robot vision system with a camera mounted on the robot hand. Evaluation on simulated data indicates that the proposed method's estimate of the hand-eye transformation stays closer to the ground truth than the other methods, with a higher rotation error but a smaller translation error. The results on a real system whose robot hand has about 1.0 mm repeatability indicate that the accuracy of our solution is 0.8 degree in rotation and 4.0 mm in translation. Experiments on simulated and real data also show that the proposed algorithm can work with a small number of stations.

References

[1] J. Kim, H. H. Nguyen, Y. Lee, and S. Lee, "Structured light camera base 3D visual perception and tracking application system with robot grasping task," in 2013 IEEE International Symposium on Assembly and Manufacturing (ISAM). IEEE, 2013, pp. 187-192.
[2] S. Levine, P. Pastor, A. Krizhevsky, J. Ibarz, and D. Quillen, "Learning hand-eye coordination for robotic grasping with deep learning and large-scale data collection," The International Journal of Robotics Research, vol. 37, no. 4-5, pp. 421-436, 2018.
[3] K. Pachtrachai, M. Allan, V. Pawar, S. Hailes, and D. Stoyanov, "Hand-eye calibration for robotic assisted minimally invasive surgery without a calibration object," in 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). IEEE, 2016, pp. 2485-2491.
[4] Z. Zhang, "Flexible camera calibration by viewing a plane from unknown orientations," in Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 1. IEEE, 1999, pp. 666-673.
[5] J. Heikkila and O. Silven, "A four-step camera calibration procedure with implicit image correction," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. IEEE, 1997, pp. 1106-1112.
[6] J. Weng, P. Cohen, M. Herniou et al., "Camera calibration with distortion models and accuracy evaluation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 14, no. 10, pp. 965-980, 1992.
[7] K. H. Strobl and G. Hirzinger, "Optimal hand-eye calibration," in 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2006, pp. 4647-4653.
[8] F. Dornaika and R. Horaud, "Simultaneous robot-world and hand-eye calibration," IEEE Transactions on Robotics and Automation, vol. 14, no. 4, pp. 617-622, 1998.
[9] M. Li and D. Betsis, "Head-eye calibration," in Proceedings of the IEEE International Conference on Computer Vision. IEEE, 1995, pp. 40-45.
[10] A. Li, L. Wang, and D. Wu, "Simultaneous robot-world and hand-eye calibration using dual-quaternions and Kronecker product," International Journal of Physical Sciences, vol. 5, no. 10, pp. 1530-1536, 2010.
[11] H. Sung, S. Lee et al., "A robot-camera hand/eye self-calibration system using a planar target," in IEEE ISR 2013. IEEE, 2013, pp. 1-4.
[12] A. M. Andrew, "Multiple view geometry in computer vision," Kybernetes, 2001.
[13] R. Y. Tsai, R. K. Lenz et al., "A new technique for fully autonomous and efficient 3D robotics hand/eye calibration," IEEE Transactions on Robotics and Automation, vol. 5, no. 3, pp. 345-358, 1989.
[14] J. C. Chou and M. Kamel, "Finding the position and orientation of a sensor on a robot manipulator using quaternions," The International Journal of Robotics Research, vol. 10, no. 3, pp. 240-254, 1991.
[15] M. Shah, "Solving the robot-world/hand-eye calibration problem using the Kronecker product," Journal of Mechanisms and Robotics, vol. 5, no. 3, 2013.
[16] H. Zhuang, Z. S. Roth, and R. Sudhakar, "Simultaneous robot/world and tool/flange calibration by solving homogeneous transformation equations of the form AX = YB," IEEE Transactions on Robotics and Automation, vol. 10, no. 4, pp. 549-554, 1994.
[17] I. Ali, O. Suominen, A. Gotchev, and E. R. Morales, "Methods for simultaneous robot-world-hand-eye calibration: A comparative study," Sensors, vol. 19, no. 12, p. 2837, 2019.

Manuscript received 20-2-2020; accepted 14-5-2020.

Nguyen Huu Hung received his Ph.D. degree in computer vision from Sungkyunkwan University, South Korea, in 2020. He is currently a researcher at the Institute of System Integration, Le Quy Don Technical University.
His research interests include computer vision, simultaneous localization and mapping (SLAM), 3D point cloud processing, deep learning, and AI.

Nguyen Quang Thi received his Ph.D. degree in Communication and Information Systems from Changchun University of Science and Technology, China, in 2014. He is currently a lecturer/researcher at the Institute of System Integration, Le Quy Don Technical University, Vietnam. His research interests include computer vision, blind deconvolution, image processing, and pattern recognition.
Tran Cong Manh received his master's degree in computer science from Le Quy Don Technical University, Vietnam, in 2007, and his Ph.D. degree from the Department of Computer Science, National Defense Academy, Japan, in 2017. His current research interests include network security, intelligent computing, and data analysis. Dr. Manh currently works as a researcher at Le Quy Don Technical University, Hanoi, Vietnam.

A METHOD FOR SIMULTANEOUSLY CALIBRATING THE CAMERA AND HAND-EYE PARAMETERS

Abstract: Determining the position and orientation of objects in the robot workspace is an important function of automated robot systems. This problem is solved by using a camera mounted on the robot arm. In this case, a viable solution for calibrating the system is needed, including calibration of the camera parameters as well as the parameters of the transformation matrices of the robot-camera system. In this paper, we present a simple and efficient method for calibrating a camera-on-hand system, in which the camera parameters and the parameters of the transformation matrix are calibrated simultaneously. The method is based on combining the 3D-to-2D projections of a calibration block with a minimum of two pure translations and two pure rotations of the arm motion. The proposed method was evaluated on simulated data and on a real camera-on-hand system, showing that it can operate stably with different noise levels and numbers of robot stations.